OpenXML: Key Ideas

Slide 7

The breadth of XML support within the Microsoft Office system facilitates the modern work environment, where copy / paste, manual editing and continuous entry of the same information is a thing of the past. In addition to document formats, Office supports many XML-based data exchange methods. Word, Excel and InfoPath all support the use of XML Web services to facilitate connections to external content. Word Smart Document Solutions enable organizations to deploy sophisticated document templates that combine external data sources and programming to guide users through the process of authoring highly structured or complex documents.

Slide 8

SpreadsheetML Example

SpreadsheetML was designed so that very short tag names could be used for any tag or attribute that appears frequently. Elements that may appear only once in a file often have longer tag names, since their size doesn't have nearly the same impact. Microsoft has established naming conventions for abbreviations shared across all three formats. Currently, the most frequently used tag names are no more than a couple of characters long. Imagine if longer, more descriptive names were used, so that each tag was five times larger.

Consider the following small, simple table:

    1  2  3
    4  5  6

Short tag example: XML, Excel 2007 SpreadsheetML

    <sheetData>
      <row r="1" spans="1:3"><c r="A1"><v>1</v></c><c r="B1"><v>2</v></c><c r="C1"><v>3</v></c></row>
      <row r="2" spans="1:3"><c r="A2"><v>4</v></c><c r="B2"><v>5</v></c><c r="C2"><v>6</v></c></row>
    </sheetData>

Long tag example: OpenDocument, OpenOffice Calc 2.0.2

    <table:table table:name="Sheet1" table:style-name="ta1" table:print="false">
      <table:table-column table:style-name="co1" table:number-columns-repeated="3" table:default-cell-style-name="Default"/>
      <table:table-row table:style-name="ro1">
        <table:table-cell office:value-type="float" office:value="1"><text:p>1</text:p></table:table-cell>
        <table:table-cell office:value-type="float" office:value="2"><text:p>2</text:p></table:table-cell>
        <table:table-cell office:value-type="float" office:value="3"><text:p>3</text:p></table:table-cell>
      </table:table-row>
      <table:table-row table:style-name="ro1">
        <table:table-cell office:value-type="float" office:value="4"><text:p>4</text:p></table:table-cell>
        <table:table-cell office:value-type="float" office:value="5"><text:p>5</text:p></table:table-cell>
        <table:table-cell office:value-type="float" office:value="6"><text:p>6</text:p></table:table-cell>
      </table:table-row>
    </table:table>
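The size argument in these notes can be checked with a small experiment. The following is a minimal sketch, not a real spreadsheet writer: the helper names `short_tags` and `long_tags` are made up for illustration, and they only reproduce the markup shape of the two examples on this slide for a grid of numbers. The script prints raw and deflated byte counts so the effect of compression can be observed as well.

```python
import zlib


def short_tags(rows):
    """Build SpreadsheetML-style markup shaped like the short tag example."""
    out = ["<sheetData>"]
    for r, row in enumerate(rows, start=1):
        out.append(f'<row r="{r}" spans="1:{len(row)}">')
        for c, value in enumerate(row):
            # Cell reference such as A1 or B2 (single-letter columns only).
            ref = f"{chr(ord('A') + c)}{r}"
            out.append(f'<c r="{ref}"><v>{value}</v></c>')
        out.append("</row>")
    out.append("</sheetData>")
    return "".join(out)


def long_tags(rows):
    """Build markup shaped like the long tag (OpenOffice Calc) example."""
    out = ['<table:table table:name="Sheet1" table:style-name="ta1" table:print="false">']
    out.append(
        '<table:table-column table:style-name="co1" '
        f'table:number-columns-repeated="{len(rows[0])}" '
        'table:default-cell-style-name="Default"/>'
    )
    for row in rows:
        out.append('<table:table-row table:style-name="ro1">')
        for value in row:
            out.append(
                f'<table:table-cell office:value-type="float" office:value="{value}">'
                f"<text:p>{value}</text:p></table:table-cell>"
            )
        out.append("</table:table-row>")
    out.append("</table:table>")
    return "".join(out)


rows = [[1, 2, 3], [4, 5, 6]]
short_xml = short_tags(rows)
long_xml = long_tags(rows)

for label, xml in (("short tags", short_xml), ("long tags", long_xml)):
    raw = xml.encode("utf-8")
    print(f"{label}: {len(raw)} bytes raw, {len(zlib.compress(raw))} bytes deflated")
```

Growing `rows` to a few thousand entries gives a more realistic picture than the six-cell table, since fixed overhead then matters less.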

Slide 10

To facilitate the interoperability of documents and enable their exchange across systems and applications, the 2007 Microsoft® Office system introduces new default XML file formats for the Microsoft Word word-processing, Excel® spreadsheet, and PowerPoint® presentation graphics programs. These new Office Open XML formats change the way developers approach solutions based on Office documents.

The Role of XML in Office

Interoperability by design: the 2007 Office system is designed to enable interoperability of documents and information between users, programs, systems and applications. It is designed to:
- achieve industry alignment using standardized technologies;
- enable data interoperability between documents, applications and systems;
- capture and reuse information to and from many data sources;
- build intelligent applications that improve data context and quality.

Slide 11

One important point is that for the end user (the person creating documents, presentations and spreadsheets) the file format change will have minimal impact. Users will still create, edit, open, save and print documents the same way they do today.

The biggest change is "under the hood". Internally, the different types of data within the file are segmented and stored separately. The "file container" (a Word document, for example) is still just a single file; the difference is how the data is structured inside it. As the slide shows, items such as comments, charts, embedded code and custom XML are stored as separate components within the document. The components are compressed inside the container, and the document content is described in XML.

Because of this structure, developers have much better access to specific types of content within a file, and because of the compression, files are typically much smaller. Applications and systems can use this modular architecture to access and modify specific file contents. For example, a developer could write a tool that searches a document repository and ensures the correct logo is used on all corporate letterhead, or one that removes tracked changes from documents that are meant to be published.

Another interesting feature is the isolation of macros and embedded code. If desired, a file can be prohibited from executing this code. In fact, a separate file format that permits embedded code cleanly separates files that are allowed to execute macros from files that are not.

Another key benefit of the file format architecture is data recovery. Because data is stored in segments, corruption of or damage to a single part of the document need not prevent the other parts (or the remainder of the document) from being opened. This helps protect the integrity of the documents that are created.
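The "separate components in one container" idea can be illustrated with nothing more than a ZIP library. The sketch below is a toy under stated assumptions: `build_toy_package` creates an in-memory ZIP whose part names merely resemble those of a Word package, and its contents are placeholders, so the result is NOT a valid Word document. The helper names are hypothetical.

```python
import io
import zipfile


def build_toy_package():
    """Create a tiny in-memory ZIP with Open XML-like part names (placeholder content)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("word/document.xml", "<document><body>Hello</body></document>")
        zf.writestr("word/comments.xml", "<comments/>")
        zf.writestr("customXml/item1.xml", "<invoice id='42'/>")
    return buf.getvalue()


def list_parts(package_bytes):
    """Return (name, uncompressed size, compressed size) for every part."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return [(i.filename, i.file_size, i.compress_size) for i in zf.infolist()]


def read_part(package_bytes, name):
    """Read a single part as text without touching the other parts."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return zf.read(name).decode("utf-8")


package = build_toy_package()
for name, size, packed in list_parts(package):
    print(f"{name}: {size} bytes ({packed} compressed)")
print(read_part(package, "word/comments.xml"))
```

Because each part is an independent ZIP entry, a tool can read or replace one part (the comments, say) without parsing the main document body.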

Slide 13

Open XML Formats Specifications

The Open XML Formats specifications are written by the standards organization Ecma International. The Office Open XML format documentation is available in PDF format in five separate parts:
- Part 1 - Fundamentals
- Part 2 - Open Packaging Conventions
- Part 3 - Primer
- Part 4 - Markup Language Reference
- Part 5 - Markup Compatibility and Extensibility

The Open XML Formats specifications are freely available under a royalty-free license on the Ecma website: http://www.ecma-international.org/news/TC45_current_work/TC45-2006-50_final_draft.htm

Slide 15

Note that standardized styling can be changed simply by modifying the style that is applied: a new look can be rolled out immediately, at any time.

Slide 16

This type of processing can occur on the client or on the server. Note that it applies in both directions:
- content coming INTO your organization (checked for viruses, inappropriate language, and so on);
- content going OUT of your organization (checked for confidential information, etc.).
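A content check of this kind might look like the sketch below. It relies on two assumptions that should be stated loudly: that VBA macros live in a part whose name ends in `vbaProject.bin` (as in macro-enabled Office files), and that a plain substring search over the XML parts is good enough for a demonstration. Real documents can split a phrase across several markup elements, so this naive search can miss matches. All function names are hypothetical, and the packages are toy ZIPs, not valid Office files.

```python
import io
import zipfile


def make_package(parts):
    """Build an in-memory toy package from a {part name: content} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, content in parts.items():
            zf.writestr(name, content)
    return buf.getvalue()


def has_macros(package_bytes):
    """Inbound check: does the package carry a VBA project part?"""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return any(n.lower().endswith("vbaproject.bin") for n in zf.namelist())


def parts_containing(package_bytes, phrase):
    """Outbound check: names of XML parts whose raw text contains the phrase."""
    hits = []
    needle = phrase.lower()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        for name in zf.namelist():
            if not name.endswith(".xml"):
                continue
            text = zf.read(name).decode("utf-8", "replace").lower()
            if needle in text:
                hits.append(name)
    return hits


incoming = make_package({
    "word/document.xml": "<document>Hello</document>",
    "word/vbaProject.bin": b"\x00\x01",
})
outgoing = make_package({
    "word/document.xml": "<document>Internal: CONFIDENTIAL draft</document>",
    "word/comments.xml": "<comments/>",
})

print("incoming has macros:", has_macros(incoming))
print("outgoing parts with flagged phrase:", parts_containing(outgoing, "confidential"))
```

The same two functions can run on a mail gateway or a document library server, which is the "client or server" point made in the notes.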

Slide 17

Instructor notes: Remember the slide library demo? You can build this type of solution yourself. Think of a document that a user has created and then uploaded to the server. There you unpack the document and parse out the information you need, using a combination of the packaging API and some XML manipulation. The result can then be stored, for example, in a database or a line-of-business (LOB) system.
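The unpack-and-parse step described here can be sketched with the Python standard library standing in for the packaging API. Assumptions: the worksheet lives at `xl/worksheets/sheet1.xml`, cells hold inline numeric values in `<v>` elements (real workbooks often store text in a separate shared strings part, which this sketch ignores), and the package built by `make_toy_workbook` is a toy that Excel would not open. The function names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# SpreadsheetML main namespace.
NS = {"s": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

SHEET_XML = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<sheetData>"
    '<row r="1" spans="1:3"><c r="A1"><v>1</v></c><c r="B1"><v>2</v></c><c r="C1"><v>3</v></c></row>'
    '<row r="2" spans="1:3"><c r="A2"><v>4</v></c><c r="B2"><v>5</v></c><c r="C2"><v>6</v></c></row>'
    "</sheetData>"
    "</worksheet>"
)


def make_toy_workbook():
    """Build an in-memory ZIP holding only one worksheet part (not a valid .xlsx)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("xl/worksheets/sheet1.xml", SHEET_XML)
    return buf.getvalue()


def extract_rows(package_bytes, part="xl/worksheets/sheet1.xml"):
    """Unpack one worksheet part and return its cell values as lists of strings."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        root = ET.fromstring(zf.read(part))
    rows = []
    for row in root.findall("s:sheetData/s:row", NS):
        cells = row.findall("s:c", NS)
        rows.append([c.findtext("s:v", default="", namespaces=NS) for c in cells])
    return rows


rows = extract_rows(make_toy_workbook())
for row in rows:
    print(row)  # each list is ready to insert into a database table or LOB system
```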

Slide 18

The most common Open XML development scenario is creating a document on the server. The client goes, for example, to a web application and selects a number of fragments (parts) that he or she wants to see assembled into a document. In the business layer we typically start from some kind of template to give us a head start: we open the template using the packaging API, then add to, configure and populate the package according to the user's choices, close the package, and return it to the client as a complete Word document.

Question for the audience: how many of you think this is an interesting scenario? Tell them they will see a demo in a minute and will also work through a sample in the first lab.
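The open-template, populate, close, return sequence can be sketched as follows. This is not the packaging API the notes refer to; it uses Python's `zipfile` as a stand-in. The `{{TITLE}}` and `{{BODY}}` placeholders, the helper names, and the toy template are all assumptions for illustration, and the template is not a valid Word document.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def make_toy_template():
    """Build an in-memory template package with placeholder tokens."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr(
            "word/document.xml",
            "<document><title>{{TITLE}}</title><body>{{BODY}}</body></document>",
        )
    return buf.getvalue()


def fill_template(template_bytes, replacements, part="word/document.xml"):
    """Copy every part of the template, substituting placeholders in one part."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == part:
                text = data.decode("utf-8")
                for placeholder, value in replacements.items():
                    # Escape user input so the part stays well-formed XML.
                    text = text.replace(placeholder, escape(value))
                data = text.encode("utf-8")
            dst.writestr(info.filename, data)
    return out.getvalue()


result = fill_template(
    make_toy_template(),
    {"{{TITLE}}": "Sales & Marketing", "{{BODY}}": "Quarterly figures"},
)
with zipfile.ZipFile(io.BytesIO(result)) as zf:
    document_xml = zf.read("word/document.xml").decode("utf-8")
print(document_xml)
```

In a web application, `result` would be the byte stream returned to the client as the finished document.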

Slide 19

This concept applies to both of the previously covered Open XML development scenarios: content within the document can be tagged with custom business semantics. This allows the meaning of the data to be defined separately from its presentation, allowing for more robust solutions and simpler programming. Note that custom tagging can be done with or without an associated schema, and with or without a custom XML data store.

Slide 20

This works in all Office applications that support the new file format. Developers can embed data in XML format as a separate part inside the Office document. Microsoft follows the same approach for the document properties and the WSS (Windows SharePoint Services) metadata. If a relationship is defined, the XML data is accessible from either VBA or .NET code. Your existing XML coding skills apply, because the XML is exposed through the familiar DOM.
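An external application can work with such an embedded XML part through an ordinary DOM, as in this sketch. Assumptions: the custom XML is stored at `customXml/item1.xml`, and the `<expense>` vocabulary is invented for the example. The code uses Python's `xml.dom.minidom` rather than the VBA or .NET object models mentioned above, and the package is a toy ZIP, not a valid Office file.

```python
import io
import zipfile
from xml.dom import minidom

CUSTOM_XML = "<expense><employee>A. User</employee><amount>120.00</amount></expense>"


def make_toy_package():
    """Build an in-memory ZIP holding a document part and one custom XML part."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("word/document.xml", "<document/>")
        zf.writestr("customXml/item1.xml", CUSTOM_XML)
    return buf.getvalue()


def load_custom_xml(package_bytes, part="customXml/item1.xml"):
    """Parse the custom XML part into a DOM document."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return minidom.parseString(zf.read(part))


doc = load_custom_xml(make_toy_package())

# Read a business value without touching the document's presentation markup.
employee = doc.getElementsByTagName("employee")[0].firstChild.data

# The DOM is editable: change the amount in place.
amount_node = doc.getElementsByTagName("amount")[0]
amount_node.firstChild.data = "150.00"

updated_xml = doc.documentElement.toxml()
print(employee)
print(updated_xml)
```

Writing `updated_xml` back into the package would follow the same copy-and-replace pattern used for any other part.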

Slide 2

OpenXML: Key Ideas
Vladimir Gabriel, Microsoft

Slide 3

Document formats
- Fortunately: IT makes it easy to create documents, and in enormous quantities
- Unfortunately: exchanging such documents can be hard because they are incompatible
- Moreover: different applications use the same information in different ways
[Figure: Sampling of Documents on the Web]

Slide 4

Working with documents

Old approach
- Paper documents
- Documents are handled electronically only until they are printed
- Forced meetings to discuss documents
- A paper management process
- Binary formats, little opportunity to reuse content

New approach
- Printing a document is a rare event that is not tied to the business process
- Electronic document formats
- Electronic discussion of documents, synchronous and asynchronous
- Electronic document management
- XML format: easy to search, find and reuse

Slide 5

Goals of OpenXML

COMPATIBILITY
- With the body of existing documents
- With the base of existing applications

OPENNESS and AVAILABILITY
- Of the specification
- Of implementations

EFFICIENCY
- Of storage
- Of processing

Slide 6

The role of OpenXML

Slide 7

The evolution of Office
- Office 97: existing binary file formats, designed in 1994 and launched in Office 97
- Office 2000 (early innovation): XML document properties
- Office XP (first XML formats): Spreadsheet XML
- Office 2003 (breakthrough XML support): WordprocessingML, SpreadsheetML, custom-defined schemas
- 2007 Office system (new XML-based formats): XML file format by default, XML PowerPoint format

Slide 8

SpreadsheetML example
Let's take a simple example:

Slide 9

SpreadsheetML example

Short tags: XML, Excel 2007 SpreadsheetML

    <sheetData>
      <row r="1" spans="1:3"><c r="A1"><v>1</v></c><c r="B1"><v>2</v></c><c r="C1"><v>3</v></c></row>
      <row r="2" spans="1:3"><c r="A2"><v>4</v></c><c r="B2"><v>5</v></c><c r="C2"><v>6</v></c></row>
    </sheetData>

Long tags: OpenDocument, OpenOffice Calc 2.0.2

    <table:table table:name="Sheet1" table:style-name="ta1" table:print="false">
      <table:table-column table:style-name="co1" table:number-columns-repeated="3" table:default-cell-style-name="Default"/>
      <table:table-row table:style-name="ro1">
        <table:table-cell office:value-type="float" office:value="1"><text:p>1</text:p></table:table-cell>
        <table:table-cell office:value-type="float" office:value="2"><text:p>2</text:p></table:table-cell>
        <table:table-cell office:value-type="float" office:value="3"><text:p>3</text:p></table:table-cell>
      </table:table-row>
      <table:table-row table:style-name="ro1">
        <table:table-cell office:value-type="float" office:value="4"><text:p>4</text:p></table:table-cell>
        <table:table-cell office:value-type="float" office:value="5"><text:p>5</text:p></table:table-cell>
        <table:table-cell office:value-type="float" office:value="6"><text:p>6</text:p></table:table-cell>
      </table:table-row>
    </table:table>

Slide 10

Bottom line

The ability to make different systems interoperate:
- Based on a standard
- Using standard tools
- Using powerful application systems

Shared service-oriented architecture (e.g. HTTP, XML, SOAP, WSDL, UDDI)
Business

Slide 11

Open XML format architecture

An Office file
- From the point of view of the user's actions, it is an ordinary file

Internal structure
- The different kinds of data are stored packed in a ZIP archive
- The contents are not accessible to users until the file is unpacked
- Applications and people can use parts of documents without using Office applications
- Integrity violations are not always critical

Slide 12

Ecma Office Open XML

- Markup languages: WordprocessingML, SpreadsheetML, PresentationML
- Vocabulary languages: DrawingML, Custom XML, Bibliography, Metadata, Equations, VML (legacy)
- Package format conventions: Content Types, Relationships, Digital Signatures
- Core technologies: ZIP, XML + Unicode

Slide 13

Summary: Open XML Formats Specifications
- Written by Ecma International
- Available as five documents:
  Part 1 - Fundamentals
  Part 2 - Open Packaging Conventions
  Part 3 - Primer
  Part 4 - Markup Language Reference
  Part 5 - Markup Compatibility and Extensibility
- Freely available, with no royalties required for use
http://www.ecma-international.org/news/TC45_current_work/TC45-2006-50_final_draft.htm

Slide 14

Not an Office 2007 user?
- Open XML compatibility pack
- All Office versions starting with 2000

Slide 15

Scenario: Changing styles
Applying formatting standards to the organization's documents

Slide 16

Scenario: Content inspection
- Removing confidential information from outgoing documents
- Removing macros and/or other inappropriate content from incoming documents

Slide 17

Scenario: Document upload
A user creates a business trip report in Excel for processing by the HR system.
Diagram: Authoring environment (Microsoft Office, etc.) -> Open XML Processing -> Back-end system (LOB/CRM/etc.)

Slide 18

Scenario: Document assembly
Creating a sales report from the sales system.
A web client or rich client allows the user to select or enter content criteria.

Slide 19

Scenario: Custom XML
Marking up a document with custom tags for further processing.
Diagram: Authoring environment -> Open XML Processing

Slide 20

Custom XML store
- Customer-defined XML is stored separately from the other parts
- Any XML can be stored:
  document properties
  WSS metadata
  custom XML (with or without an XML schema)
- In Word, the XML data is available as an editable tree (using the ordinary DOM)
- External applications (client/server) can process and modify this data

Slide 21

Open XML tools

Slide 22

Applications for OpenXML
- Office 2000, XP, 2003, 2007
- iWork '08 (http://www.apple.com/iwork/)
- Apple iPhone
- Corel PerfectOffice
- Novell edition of OpenOffice.org
- Gnumeric

Slide 23

Additional resources
- www.microsoft.com/office/preview
- www.OpenXMLDeveloper.org
- www.Ecmainternational.org
- Blogs.msdn.com/brian_jones
- msdn.microsoft.com/office/xml
- www.microsoft.com/technet/prodtechnol/office
- www.microsoft.com/resources/casestudies

Slide 24

Copyright ©2006 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.

Summary: Vladimir Gabriel, Platforma 2008, https://platforma2008.ru

Tags: service oriented architecture platform 2008 conference
