The breadth of XML support within the Microsoft Office system supports the modern work environment, where copy and paste, manual editing and repeated entry of the same information are things of the past. In addition to document formats, Office supports many XML-based data exchange methods. Word, Excel and InfoPath all support the use of XML Web services to connect to external content. Word Smart Document Solutions enable organizations to deploy sophisticated document templates that combine external data sources and programming to guide users through the process of authoring highly structured or complex documents.
SpreadsheetML Example

SpreadsheetML was designed so that very short tag names are used for any tag or attribute that appears frequently. Elements that appear only once in a file often have longer tag names, since their size has far less impact. Microsoft has established naming conventions for abbreviations shared across all three formats. Currently, the most frequently used tag names are no more than a couple of characters long. Imagine the effect if longer, more descriptive names were used and each tag were five times larger. Consider a small, simple table (two rows holding the values 1 to 6):

Short tag example: XML Excel 2007 SpreadsheetML
<sheetData><row r="1" spans="1:3"><c r="A1"><v>1</v></c><c r="B1"><v>2</v></c><c r="C1"><v>3</v></c></row><row r="2" spans="1:3"><c r="A2"><v>4</v></c><c r="B2"><v>5</v></c><c r="C2"><v>6</v></c></row></sheetData>

Long tag example: OD - OpenOffice Calc 2.0.2
<table:table table:name="Sheet1" table:style-name="ta1" table:print="false"><table:table-column table:style-name="co1" table:number-columns-repeated="3" table:default-cell-style-name="Default"/><table:table-row table:style-name="ro1"><table:table-cell office:value-type="float" office:value="1"><text:p>1</text:p></table:table-cell><table:table-cell office:value-type="float" office:value="2"><text:p>2</text:p></table:table-cell><table:table-cell office:value-type="float" office:value="3"><text:p>3</text:p></table:table-cell></table:table-row><table:table-row table:style-name="ro1"><table:table-cell office:value-type="float" office:value="4"><text:p>4</text:p></table:table-cell><table:table-cell office:value-type="float" office:value="5"><text:p>5</text:p></table:table-cell><table:table-cell office:value-type="float" office:value="6"><text:p>6</text:p></table:table-cell></table:table-row></table:table>
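To put a rough number on the tag-length argument, the sketch below rebuilds the first row of each fragment from the SpreadsheetML example and compares their uncompressed sizes in bytes. The helper names (`short_row`, `long_row`, `size_in_bytes`) are made up for this illustration, and the comparison ignores ZIP compression, so treat it as an indication only.

```python
# Rebuild row 1 of both fragments from the example and compare raw sizes.
# Helper names are hypothetical; only the markup comes from the slides.

def short_row(row_number, values):
    """One SpreadsheetML row using the short tag names (row, c, v)."""
    cells = "".join(
        f'<c r="{column}{row_number}"><v>{value}</v></c>'
        for column, value in zip("ABC", values)
    )
    return f'<row r="{row_number}" spans="1:3">{cells}</row>'


def long_row(values):
    """The same row using the long tag names from the OpenOffice Calc sample."""
    cells = "".join(
        f'<table:table-cell office:value-type="float" office:value="{value}">'
        f"<text:p>{value}</text:p></table:table-cell>"
        for value in values
    )
    return f'<table:table-row table:style-name="ro1">{cells}</table:table-row>'


def size_in_bytes(fragment):
    """Size of the markup when encoded as UTF-8, before any compression."""
    return len(fragment.encode("utf-8"))


short_size = size_in_bytes(short_row(1, [1, 2, 3]))
long_size = size_in_bytes(long_row([1, 2, 3]))
print(f"short tags: {short_size} bytes, long tags: {long_size} bytes")
print(f"ratio: {long_size / short_size:.1f}x")
```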
To facilitate the interoperability of documents and enable their exchange across systems and applications, the 2007 Microsoft® Office system introduces new default XML file formats for the Microsoft Word text processing, Excel® spreadsheet, and PowerPoint® presentation graphics programs. These new Office Open XML formats change the way developers approach solutions based on Office documents.

The Role of XML in Office

Interoperability by design: the 2007 Office system is designed to enable interoperability of documents and information between users, programs, systems and applications. The 2007 Microsoft® Office system is designed to:
- achieve industry alignment using standardized technologies
- enable data interoperability between documents, applications and systems
- capture and reuse information to and from many data sources
- build intelligent applications that improve data context and quality
One of the important things to remember in this description is that for the end user (the person creating documents, presentations and spreadsheets) the file format change will have minimal impact. Users will still create, edit, open, save and print documents in the same way they do today.

The biggest change is what happens “under the hood” of the new file format. Internally, the different types of data within the file are segmented and stored separately. The “file container” (a Word document, for example) is still just a file; the difference is how the data is structured inside it. As you can see from the example, items like comments, charts, embedded code and custom XML are all stored as separate components within the document. Each component (and the entire file itself) is compressed and fully described in XML syntax. Because of this file structure, developers have much better access to the specific types of content within a file, and as a result of the compression, file sizes are much smaller.

Applications and systems can use this modular format architecture to access and modify specific file contents. For example, a developer could write a tool that searches a document repository and ensures the correct logo is used on all corporate letterhead, or one that removes tracked changes from documents that are meant to be published.

Another interesting feature is the isolation of macros and embedded code. If desired, a file can be prohibited from executing this code. In fact, a separate file format enabling the use of embedded code will be created to cleanly separate files that are allowed to execute macros from those that are not.

Another key benefit of the file format architecture is data recovery. Because of the segmented data storage, corruption of or damage to a single part of the document will not prevent the other parts (or the remainder of the document) from being opened. This is important for helping to ensure the integrity of the documents that are created.
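The segmented storage described in these notes can be explored with any ZIP library. The sketch below builds a small stand-in package in memory and then lists and reads its parts using only the Python standard library. The part names follow the usual Word layout, but the part contents are placeholders, so the result is not a valid Word document; the helper names are assumptions made for this illustration.

```python
import io
import zipfile


def build_demo_package():
    """Create a stand-in package in memory (placeholder parts, not a valid .docx)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("word/document.xml", "<document>Body text</document>")
        package.writestr("word/comments.xml", "<comments>Review note</comments>")
    return buffer.getvalue()


def list_parts(package_bytes):
    """Return the names of all parts stored in the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.namelist()


def read_part(package_bytes, part_name):
    """Read one part without touching the others."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.read(part_name).decode("utf-8")


demo = build_demo_package()
parts = list_parts(demo)
comments = read_part(demo, "word/comments.xml")
print(parts)
print(comments)
```

Because each part is a separate ZIP entry, a tool can read the comments part alone, which is the kind of targeted access the notes describe.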
Open XML Formats Specifications

The Open XML Formats specifications are written by the standards organization Ecma International. The Office Open XML format documentation is available in PDF format in five separate parts:
- Part 1 - Fundamentals
- Part 2 - Open Packaging Conventions
- Part 3 - Primer
- Part 4 - Markup Language Reference
- Part 5 - Markup Compatibility and Extensibility

The Open XML Formats specifications are freely available under a royalty-free license on the Ecma website: http://www.ecma-international.org/news/TC45_current_work/TC45-2006-50_final_draft.htm
Note that standardized styling can be modified by simply modifying the style that is applied: a new look can be implemented immediately at any time.
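As a rough illustration of this point, the sketch below changes the colour of one style definition in a small WordprocessingML styles fragment. The fragment is a simplified stand-in for a real `word/styles.xml` part, and the style id `Heading1`, the colour values and the function name `recolor_style` are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Simplified stand-in for a styles part; real files contain many more styles.
STYLES = (
    f'<w:styles xmlns:w="{W}">'
    '<w:style w:type="paragraph" w:styleId="Heading1">'
    '<w:rPr><w:color w:val="000000"/></w:rPr>'
    "</w:style>"
    "</w:styles>"
)


def recolor_style(styles_xml, style_id, color):
    """Set the colour of every w:color element inside the named style."""
    root = ET.fromstring(styles_xml)
    for style in root.iter(f"{{{W}}}style"):
        if style.get(f"{{{W}}}styleId") == style_id:
            for color_element in style.iter(f"{{{W}}}color"):
                color_element.set(f"{{{W}}}val", color)
    return ET.tostring(root, encoding="unicode")


updated = recolor_style(STYLES, "Heading1", "FF0000")
print(updated)
```

Every paragraph that refers to the style picks up the new look, because only the definition changes and the document body is left untouched.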
This type of processing could occur on the client or the server. Note that it applies in both directions:
- content coming INTO your organization (checked for viruses, inappropriate language, and so on)
- content going OUT of your organization (checked for confidential information, etc.)
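One simple inbound check is to look for a VBA project part before a document is accepted. The sketch below assumes that macro code is stored in a part named `vbaProject.bin`, which is how Word and Excel commonly package it; a production scanner would also need to inspect content types and other embedded content. The function names and the placeholder packages are hypothetical.

```python
import io
import zipfile


def make_package(part_names):
    """Build a placeholder package containing the given part names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name in part_names:
            package.writestr(name, b"placeholder")
    return buffer.getvalue()


def contains_macros(package_bytes):
    """Return True if any part looks like a VBA project (assumed name)."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return any(
            name.lower().endswith("vbaproject.bin")
            for name in package.namelist()
        )


clean = make_package(["[Content_Types].xml", "word/document.xml"])
with_macros = make_package(
    ["[Content_Types].xml", "word/document.xml", "word/vbaProject.bin"]
)
print(contains_macros(clean), contains_macros(with_macros))
```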
Instructor Notes: Remember the slide library demo? You can build these types of solutions yourself. Think of a document that has been created by the user and then uploaded to the server, where you unpack the document and parse out all the information you need using a combination of the packaging API and some XML manipulation. The result can, for example, be stored in a database or a LOB system.
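The "parse out the information you need" step can be sketched with a standard XML parser. The example below reads the cell values from the `sheetData` fragment used earlier in this deck. The fragment is namespace-free as printed on the slide; a real worksheet part declares the SpreadsheetML namespace, so the tag lookups would need to be qualified. The function name `extract_cells` is an assumption.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the SpreadsheetML example (no namespace declared there).
SHEET_DATA = (
    '<sheetData>'
    '<row r="1" spans="1:3"><c r="A1"><v>1</v></c><c r="B1"><v>2</v></c>'
    '<c r="C1"><v>3</v></c></row>'
    '<row r="2" spans="1:3"><c r="A2"><v>4</v></c><c r="B2"><v>5</v></c>'
    '<c r="C2"><v>6</v></c></row>'
    '</sheetData>'
)


def extract_cells(sheet_xml):
    """Map each cell reference (such as 'A1') to its stored value text."""
    root = ET.fromstring(sheet_xml)
    cells = {}
    for cell in root.iter("c"):
        value = cell.find("v")
        if value is not None:
            cells[cell.get("r")] = value.text
    return cells


cells = extract_cells(SHEET_DATA)
print(cells)
```

The resulting dictionary is the kind of intermediate result that could then be written to a database or passed to a line-of-business system.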
The most common Open XML development scenario is creating a document on the server. The client, for example, goes to a web application and selects a number of fragments (parts) that he or she wants to see assembled in a document. In the business layer, we typically use some kind of template to give us a head start: we open that template using the packaging API, then add to, configure and populate the package according to the user's choices, close the package, and return it to the client as a full Word document. Question: How many of you think that this is an interesting scenario? (Tell them that they will see a demo in a minute and will also work out a sample in the first lab.)
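The open, populate, close and return sequence can be sketched with the standard `zipfile` module standing in for the packaging API. In this illustration the template is a placeholder package, the marker `@@FRAGMENTS@@` and the `<p>` elements are invented for the example, and the output is not valid WordprocessingML; the point is only the flow of copying the template parts and rewriting one of them.

```python
import io
import zipfile
from xml.sax.saxutils import escape

PLACEHOLDER = "@@FRAGMENTS@@"  # hypothetical marker inside the template


def build_template():
    """Create a stand-in template package (not a valid Word document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr(
            "word/document.xml", f"<document>{PLACEHOLDER}</document>"
        )
    return buffer.getvalue()


def assemble(template_bytes, fragments):
    """Copy every template part, filling the main part with the fragments."""
    body = "".join(f"<p>{escape(text)}</p>" for text in fragments)
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template_bytes)) as template, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as result:
        for item in template.infolist():
            data = template.read(item.filename)
            if item.filename == "word/document.xml":
                text = data.decode("utf-8").replace(PLACEHOLDER, body)
                data = text.encode("utf-8")
            result.writestr(item.filename, data)
    return output.getvalue()


assembled = assemble(build_template(), ["Intro", "Q1 & Q2"])
with zipfile.ZipFile(io.BytesIO(assembled)) as package:
    document_xml = package.read("word/document.xml").decode("utf-8")
print(document_xml)
```

The bytes returned by `assemble` are what a web application would send back to the client as the finished document.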
This concept applies to both of the previously covered Open XML development scenarios: content within the document can be tagged with custom business semantics. This allows the meaning of the data to be defined separately from its presentation, allowing for more robust solutions and simpler programming. Note that custom tagging can be done with or without an associated schema, and with or without a custom XML data store.
This is something that you can do in all Office applications that support the new file format. Developers can package data in XML format embedded in the Office document. Microsoft also follows this approach for the document properties and the WSS metadata. If a relationship is defined, the XML data is accessible from either VBA or .NET code. You can use your existing XML coding skills, since the XML is made available through the familiar DOM.
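The same data can also be reached from outside Office. The sketch below reads a custom XML part from a stand-in package and walks it with the standard DOM API in Python. The part name `customXml/item1.xml` follows the usual location of custom XML parts, while the `expense`, `employee` and `total` elements are invented for this illustration, and the package is not a valid Office file.

```python
import io
import zipfile
from xml.dom import minidom

CUSTOM_PART = "customXml/item1.xml"  # usual location of a custom XML part


def build_package():
    """Create a stand-in package carrying one custom XML part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr(
            CUSTOM_PART,
            '<expense><employee>A. Example</employee>'
            '<total currency="USD">120.50</total></expense>',
        )
    return buffer.getvalue()


def read_custom_xml(package_bytes):
    """Parse the custom XML part into a DOM document."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return minidom.parseString(package.read(CUSTOM_PART))


dom = read_custom_xml(build_package())
total = dom.getElementsByTagName("total")[0]
amount = total.firstChild.data
currency = total.getAttribute("currency")
print(amount, currency)
```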
OpenXML: Key Ideas
Vladimir Gabriel, Microsoft
Document Formats
- Fortunately: IT makes it possible to create documents easily and in huge quantities
- Unfortunately: exchanging such documents can be difficult, because they are incompatible
- In addition: different applications use the same information in different ways
Figure: Sampling of Documents on the Web
Working with Documents
Old approach:
- Paper documents
- Documents are processed electronically only until they are printed
- Forced meetings to discuss documents
- A paper management process
- Binary formats, little opportunity to reuse content
New approach:
- Printing documents is rare and not tied to the business process
- Electronic document formats
- Electronic discussion of documents, synchronous and asynchronous
- Electronic document management
- XML format: easy to search, find and reuse
Goals of OpenXML
- COMPATIBILITY: with the body of existing documents; with the base of existing applications
- OPENNESS and AVAILABILITY: of the specification; of implementations
- EFFICIENCY: of storage; of operation

The Role of OpenXML
The Evolution of Office
- Office 97: Existing binary file formats designed in 1994, launched in Office 97
- Office 2000: Early Innovation. XML Document Properties
- Office XP: First XML Formats. Spreadsheet XML
- Office 2003: Breakthrough XML Support. WordProcessingML, SpreadsheetML, Custom-defined schema
- 2007 Office system: New XML-based Formats. XML file format by default, XML PowerPoint format
Let's take a simple example: SpreadsheetML example

SpreadsheetML Example

Short tags: XML Excel 2007 SpreadsheetML
<sheetData><row r="1" spans="1:3"><c r="A1"><v>1</v></c><c r="B1"><v>2</v></c><c r="C1"><v>3</v></c></row><row r="2" spans="1:3"><c r="A2"><v>4</v></c><c r="B2"><v>5</v></c><c r="C2"><v>6</v></c></row></sheetData>

Long tags: OD - OpenOffice Calc 2.0.2
<table:table table:name="Sheet1" table:style-name="ta1" table:print="false"><table:table-column table:style-name="co1" table:number-columns-repeated="3" table:default-cell-style-name="Default"/><table:table-row table:style-name="ro1"><table:table-cell office:value-type="float" office:value="1"><text:p>1</text:p></table:table-cell><table:table-cell office:value-type="float" office:value="2"><text:p>2</text:p></table:table-cell><table:table-cell office:value-type="float" office:value="3"><text:p>3</text:p></table:table-cell></table:table-row><table:table-row table:style-name="ro1"><table:table-cell office:value-type="float" office:value="4"><text:p>4</text:p></table:table-cell><table:table-cell office:value-type="float" office:value="5"><text:p>5</text:p></table:table-cell><table:table-cell office:value-type="float" office:value="6"><text:p>6</text:p></table:table-cell></table:table-row></table:table>
Summary
The ability to make different systems work together:
- Based on a standard
- Using standard tools
- Using powerful application systems
Shared service oriented architecture (e.g. HTTP, XML, SOAP, WSDL, UDDI)
Business
Open XML Format Architecture
Office file:
- From the user's point of view, an ordinary file
Internally:
- The different kinds of data are stored packed in a ZIP archive
- The contents are not accessible to users until the file is unpacked
- Applications and people can use parts of documents without using Office applications
- Integrity violations are not always critical
Ecma Office Open XML
- Markup languages: WordprocessingML, SpreadsheetML, PresentationML
- Vocabularies: DrawingML, Custom XML, Bibliography, Metadata, Equations, VML (legacy)
- Package format conventions: Content Types, Relationships, Digital Signatures
- Core technologies: ZIP, XML + Unicode
In Summary: Open XML Formats Specifications
- Written by Ecma International
- Available as five documents:
  Part 1 - Fundamentals
  Part 2 - Open Packaging Conventions
  Part 3 - Primer
  Part 4 - Markup Language Reference
  Part 5 - Markup Compatibility and Extensibility
- Freely available and royalty-free
http://www.ecma-international.org/news/TC45_current_work/TC45-2006-50_final_draft.htm
Not an Office 2007 user?
- Open XML compatibility pack
- All Office versions starting with 2000
Scenario: Changing Styles
- Applying formatting standards to an organization's documents

Scenario: Content Inspection
- Removing confidential information from outgoing documents
- Removing macros and/or other inappropriate content from incoming documents
Scenario: Document Upload
- A user creates a business trip report in Excel for processing by the HR system
Diagram labels: Authoring environment (Microsoft Office, etc.), Open XML Processing, Back-end system (LOB/CRM/etc.)

Scenario: Document Assembly
- Creating a sales report from the sales system
- Web client or rich client allows user to select or enter content criteria

Scenario: Custom XML
- Marking up a document with custom tags for further processing
Diagram labels: Authoring environment, Open XML Processing
Custom XML Data Store
- Customer-defined XML is stored separately from the other parts
- Any XML can be stored: document properties, WSS metadata, custom XML (with or without an XML schema)
- The XML data is available as an editable tree (using the ordinary DOM) in Word
- External applications (client/server) can process and modify this data
Open XML Tools

Applications for OpenXML
- Office: 2000, XP, 2003, 2007
- iWork ’08 http://www.apple.com/iwork/
- Apple iPhone
- Corel PerfectOffice
- Novell's version of OpenOffice.org, Gnumeric

Additional Resources
- www.microsoft.com/office/preview
- www.OpenXMLDeveloper.org
- www.Ecmainternational.org
- Blogs.msdn.com/brian_jones
- msdn.microsoft.com/office/xml
- www.microsoft.com/technet/prodtechnol/office
- www.microsoft.com/resources/casestudies
Copyright ©2006 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
Summary: Vladimir Gabriel, Platforma 2008, https://platforma2008.ru