How to read BrOffice documents from scripts in Scribus

Scribus imports text documents from BrOffice / LibreOffice / OpenOffice.org, but only manually, through the "File > Open" menu. To import them from a script, as part of an automated task, I have to use the appropriate Python module.

This module is OOoPy, whose project is hosted at http://sourceforge.net/projects/ooopy/.

[Author's note: article in progress]

To download and install the OOoPy module into the Python on my Ubuntu system, I opened a terminal and ran the following commands:

wget http://ufpr.dl.sourceforge.net/project/ooopy/ooopy/1.6.7680/OOoPy-1.6.7680.tar.gz
tar -vzxf ./OOoPy-1.6.7680.tar.gz
cd ./OOoPy-1.6.7680
sudo python setup.py install

If I repeat this in the future, I should adjust the version numbers to match the newest release.
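
As a small sketch of which parts of the commands change with the version, the following builds the file name and download URL from a version string. The function name `ooopy_tarball` is hypothetical, and it is only an assumption that other releases follow the same URL layout as the one in the wget command above.

```python
# Base of the download URL, copied from the wget command above.
BASE_URL = "http://ufpr.dl.sourceforge.net/project/ooopy/ooopy"


def ooopy_tarball(version):
    """Return (file_name, url) for a given OOoPy version string.

    Assumption: every release uses the same directory layout as 1.6.7680.
    """
    file_name = "OOoPy-%s.tar.gz" % version
    url = "%s/%s/%s" % (BASE_URL, version, file_name)
    return file_name, url


name, url = ooopy_tarball("1.6.7680")
print(name)
print(url)
```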

On Ubuntu, the output of the installation should look something like this:

running install
running build
running build_py
creating build
creating build/lib.linux-i686-2.6
creating build/lib.linux-i686-2.6/ooopy
copying ooopy/Transformer.py -> build/lib.linux-i686-2.6/ooopy
copying ooopy/OOoPy.py -> build/lib.linux-i686-2.6/ooopy
copying ooopy/Version.py -> build/lib.linux-i686-2.6/ooopy
copying ooopy/Transforms.py -> build/lib.linux-i686-2.6/ooopy
copying ooopy/__init__.py -> build/lib.linux-i686-2.6/ooopy
running build_scripts
creating build/scripts-2.6
copying and adjusting ooo_as_text -> build/scripts-2.6
copying and adjusting ooo_cat -> build/scripts-2.6
copying and adjusting ooo_fieldreplace -> build/scripts-2.6
copying ooo_grep -> build/scripts-2.6
copying and adjusting ooo_mailmerge -> build/scripts-2.6
changing mode of build/scripts-2.6/ooo_as_text from 644 to 755
changing mode of build/scripts-2.6/ooo_cat from 644 to 755
changing mode of build/scripts-2.6/ooo_fieldreplace from 644 to 755
changing mode of build/scripts-2.6/ooo_mailmerge from 644 to 755
running install_lib
creating /usr/local/lib/python2.6/dist-packages/ooopy
copying build/lib.linux-i686-2.6/ooopy/Transformer.py -> /usr/local/lib/python2.6/dist-packages/ooopy
copying build/lib.linux-i686-2.6/ooopy/OOoPy.py -> /usr/local/lib/python2.6/dist-packages/ooopy
copying build/lib.linux-i686-2.6/ooopy/Version.py -> /usr/local/lib/python2.6/dist-packages/ooopy
copying build/lib.linux-i686-2.6/ooopy/Transforms.py -> /usr/local/lib/python2.6/dist-packages/ooopy
copying build/lib.linux-i686-2.6/ooopy/__init__.py -> /usr/local/lib/python2.6/dist-packages/ooopy
byte-compiling /usr/local/lib/python2.6/dist-packages/ooopy/Transformer.py to Transformer.pyc
byte-compiling /usr/local/lib/python2.6/dist-packages/ooopy/OOoPy.py to OOoPy.pyc
byte-compiling /usr/local/lib/python2.6/dist-packages/ooopy/Version.py to Version.pyc
byte-compiling /usr/local/lib/python2.6/dist-packages/ooopy/Transforms.py to Transforms.pyc
byte-compiling /usr/local/lib/python2.6/dist-packages/ooopy/__init__.py to __init__.pyc
running install_scripts
copying build/scripts-2.6/ooo_cat -> /usr/local/bin
copying build/scripts-2.6/ooo_mailmerge -> /usr/local/bin
copying build/scripts-2.6/ooo_fieldreplace -> /usr/local/bin
copying build/scripts-2.6/ooo_as_text -> /usr/local/bin
copying build/scripts-2.6/ooo_grep -> /usr/local/bin
changing mode of /usr/local/bin/ooo_cat to 755
changing mode of /usr/local/bin/ooo_mailmerge to 755
changing mode of /usr/local/bin/ooo_fieldreplace to 755
changing mode of /usr/local/bin/ooo_as_text to 755
changing mode of /usr/local/bin/ooo_grep to 755
running install_data
creating /usr/local/share/ooopy
copying test.sxw -> /usr/local/share/ooopy
copying carta.stw -> /usr/local/share/ooopy
copying test.odt -> /usr/local/share/ooopy
copying carta.odt -> /usr/local/share/ooopy
copying rechng.sxw -> /usr/local/share/ooopy
copying rechng.odt -> /usr/local/share/ooopy
copying run_doctest.py -> /usr/local/share/ooopy
copying x.csv -> /usr/local/share/ooopy
running install_egg_info
Writing /usr/local/lib/python2.6/dist-packages/OOoPy-1.6.7680.egg-info

Downloading and installing on Windows is more complicated. Since I have the wget downloader and the 7z archiver, the sequence of commands I used was:

wget http://ufpr.dl.sourceforge.net/project/ooopy/ooopy/1.6.7680/OOoPy-1.6.7680.tar.gz
"%PROGRAMFILES%"\7-Zip\7z.exe x OOoPy-1.6.7680.tar.gz
"%PROGRAMFILES%"\7-Zip\7z.exe x -o"%PROGRAMFILES%"\"Scribus 1.3.9"\lib\site-packages\ OOoPy-1.6.7680.tar
del OOoPy-1.6.7680.tar
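
Where 7-Zip is not at hand, Python's standard tarfile module can unpack a .tar.gz in a single step. This is a sketch of an alternative, not the method used above; the function name `extract_tarball` is hypothetical, and the paths in the usage comment are simply the ones from the commands above.

```python
import tarfile


def extract_tarball(tar_gz_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir.

    Returns the sorted top-level names found in the archive, which
    for a source release is normally a single folder.
    """
    archive = tarfile.open(tar_gz_path, "r:gz")
    try:
        top_level = sorted(set(m.name.split("/")[0] for m in archive.getmembers()))
        archive.extractall(dest_dir)
    finally:
        archive.close()
    return top_level


# Usage (paths as in the commands above; adjust to the local setup):
# extract_tarball("OOoPy-1.6.7680.tar.gz",
#                 r"C:\Program Files\Scribus 1.3.9\lib\site-packages")
```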

I ran the setup.py file through the Scribus Scripter (menu "Script > Execute Script...", then browsed to the file at "%PROGRAMFILES%"\"Scribus 1.3.9"\lib\site-packages\OOoPy-1.6.7680\setup.py).

Next, I copied the folder "%PROGRAMFILES%"\"Scribus 1.3.9"\lib\site-packages\OOoPy-1.6.7680\ooopy\ into the folder "%PROGRAMFILES%"\"Scribus 1.3.9"\lib\site-packages. This method is something of a hack, because it does not create the standard egg-info file that the Linux installation creates.
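
The manual copy step can also be scripted. The following is a minimal sketch using only the standard library; the function name `copy_package` is hypothetical, and like the manual copy it does not create any egg-info file.

```python
import os
import shutil


def copy_package(src_pkg_dir, site_packages_dir):
    """Copy a package folder into site-packages, replacing an older copy.

    Returns the path of the copied package folder.
    """
    # normpath drops a trailing separator so basename yields the folder name.
    pkg_name = os.path.basename(os.path.normpath(src_pkg_dir))
    target = os.path.join(site_packages_dir, pkg_name)
    if os.path.isdir(target):
        shutil.rmtree(target)
    shutil.copytree(src_pkg_dir, target)
    return target


# Usage (folders as described above; adjust to the local installation):
# site = os.path.join(os.environ["PROGRAMFILES"], "Scribus 1.3.9", "lib", "site-packages")
# copy_package(os.path.join(site, "OOoPy-1.6.7680", "ooopy"), site)
```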

Another option is to install the module into the standard Python and then copy the installed files over to the Python bundled with Scribus:

c:\>cd "c:\Python26\Lib\site-packages\OOoPy-1.6.7680\"

C:\Python26\Lib\site-packages\OOoPy-1.6.7680>c:\Python26\python.exe setup.py install
running install
running build
running build_py
creating build
creating build\lib
creating build\lib\ooopy
copying ooopy\OOoPy.py -> build\lib\ooopy
copying ooopy\Transformer.py -> build\lib\ooopy
copying ooopy\Transforms.py -> build\lib\ooopy
copying ooopy\Version.py -> build\lib\ooopy
copying ooopy\__init__.py -> build\lib\ooopy
running build_scripts
creating build\scripts-2.6
copying and adjusting ooo_as_text -> build\scripts-2.6
copying and adjusting ooo_cat -> build\scripts-2.6
copying and adjusting ooo_fieldreplace -> build\scripts-2.6
copying ooo_grep -> build\scripts-2.6
copying and adjusting ooo_mailmerge -> build\scripts-2.6
running install_lib
creating c:\Python26\Lib\site-packages\ooopy
copying build\lib\ooopy\OOoPy.py -> c:\Python26\Lib\site-packages\ooopy
copying build\lib\ooopy\Transformer.py -> c:\Python26\Lib\site-packages\ooopy
copying build\lib\ooopy\Transforms.py -> c:\Python26\Lib\site-packages\ooopy
copying build\lib\ooopy\Version.py -> c:\Python26\Lib\site-packages\ooopy
copying build\lib\ooopy\__init__.py -> c:\Python26\Lib\site-packages\ooopy
byte-compiling c:\Python26\Lib\site-packages\ooopy\OOoPy.py to OOoPy.pyc
byte-compiling c:\Python26\Lib\site-packages\ooopy\Transformer.py to Transformer.pyc
byte-compiling c:\Python26\Lib\site-packages\ooopy\Transforms.py to Transforms.pyc
byte-compiling c:\Python26\Lib\site-packages\ooopy\Version.py to Version.pyc
byte-compiling c:\Python26\Lib\site-packages\ooopy\__init__.py to __init__.pyc
running install_scripts
copying build\scripts-2.6\ooo_as_text -> c:\Python26\Scripts
copying build\scripts-2.6\ooo_cat -> c:\Python26\Scripts
copying build\scripts-2.6\ooo_fieldreplace -> c:\Python26\Scripts
copying build\scripts-2.6\ooo_grep -> c:\Python26\Scripts
copying build\scripts-2.6\ooo_mailmerge -> c:\Python26\Scripts
running install_data
creating c:\Python26\share
creating c:\Python26\share\ooopy
copying test.sxw -> c:\Python26\share\ooopy
copying carta.stw -> c:\Python26\share\ooopy
copying test.odt -> c:\Python26\share\ooopy
copying carta.odt -> c:\Python26\share\ooopy
copying rechng.sxw -> c:\Python26\share\ooopy
copying rechng.odt -> c:\Python26\share\ooopy
copying run_doctest.py -> c:\Python26\share\ooopy
copying x.csv -> c:\Python26\share\ooopy
running install_egg_info
Writing c:\Python26\Lib\site-packages\OOoPy-1.6.7680-py2.6.egg-info

 

To find out how the OOoPy module works, I ran the following commands in the Scribus Scripter console:

from ooopy.OOoPy import OOoPy
help (OOoPy)
from ooopy.Transformer import Transformer
help (Transformer)

On Ubuntu 10, I have to insert the paths before using this code, because of a bug in Ubuntu itself:

import sys
sys.path.insert(0,'/usr/lib/python2.6/')
sys.path.insert(0,'')
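
A minimal sketch of the same workaround that avoids inserting a path twice, in case the interpreter session is reused between script runs. The helper name `prepend_path` is hypothetical; the path is the one from the lines above.

```python
import sys


def prepend_path(path):
    """Put path at the front of sys.path unless it is already listed.

    Returns True if the path was inserted, False if it was already there.
    """
    if path in sys.path:
        return False
    sys.path.insert(0, path)
    return True


prepend_path('/usr/lib/python2.6/')
```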

OOoPy help

Help on class OOoPy in module ooopy.OOoPy:

class OOoPy(autosuper)
|  Wrapper for OpenOffice.org zip files (all OOo documents are
|  really zip files internally).
|
|  from ooopy.OOoPy import OOoPy
|  >>> o = OOoPy (infile = 'test.sxw', outfile = 'out.sxw')
|  >>> o.mimetype
|  'application/vnd.sun.xml.writer'
|  >>> for f in files :
|  ...     e = o.read (f)
|  ...     e.write ()
|  ...
|  >>> o.close ()
|  >>> o = OOoPy (infile = 'test.odt', outfile = 'out2.odt')
|  >>> o.mimetype
|  'application/vnd.oasis.opendocument.text'
|  >>> for f in files :
|  ...     e = o.read (f)
|  ...     e.write ()
|  ...
|  >>> o.close ()
|  >>> o = OOoPy (infile = 'out2.odt')
|  >>> for f in o.izip.infolist () :
|  ...     print f.filename, f.create_system
|  mimetype 0
|  content.xml 0
|  styles.xml 0
|  meta.xml 0
|  settings.xml 0
|  META-INF/manifest.xml 0
|  Configurations2/statusbar/ 0
|  Configurations2/accelerator/current.xml 0
|  Configurations2/floater/ 0
|  Configurations2/popupmenu/ 0
|  Configurations2/progressbar/ 0
|  Configurations2/menubar/ 0
|  Configurations2/toolbar/ 0
|  Configurations2/images/Bitmaps/ 0
|  Thumbnails/thumbnail.png 0
|  >>> for f in o.izip.infolist () :
|  ...     print f.filename, f.compress_type, f.compress_size, f.file_size
|  mimetype 8 41 39
|  content.xml 8 1930 16212
|  styles.xml 8 1888 12743
|  meta.xml 8 436 1545
|  settings.xml 8 1376 7862
|  META-INF/manifest.xml 8 286 1845
|  Configurations2/statusbar/ 0 0 0
|  Configurations2/accelerator/current.xml 8 2 0
|  Configurations2/floater/ 0 0 0
|  Configurations2/popupmenu/ 0 0 0
|  Configurations2/progressbar/ 0 0 0
|  Configurations2/menubar/ 0 0 0
|  Configurations2/toolbar/ 0 0 0
|  Configurations2/images/Bitmaps/ 0 0 0
|  Thumbnails/thumbnail.png 8 2145 2367
|
|  Method resolution order:
|      OOoPy
|      autosuper
|      __builtin__.object
|
|  Methods defined here:
|
|  __del__ = close(self)
|
|  __init__(self, infile=None, outfile=None, write_mode='w', mimetype=None)
|      Open an OOo document, if no outfile is given, we open the
|      file read-only. Otherwise the outfile has to be different
|      from the infile -- the python ZipFile can't deal with
|      read-write access. In case an outfile is given, we open it
|      in "w" mode as a zip file, unless write_mode is specified
|      (the only allowed case would be "a" for appending to an
|      existing file, see pythons ZipFile documentation for
|      details). If no infile is given, the user is responsible for
|      providing all necessary files in the resulting output file.
|
|      It seems that OOo needs to have the mimetype as the first
|      archive member (at least with mimetype as the first member
|      it works, the order may not be arbitrary) to recognize a zip
|      archive as an OOo file. When copying from a given infile, we
|      use the same order of elements in the resulting output. When
|      creating new elements we make sure the mimetype is the first
|      in the resulting archive.
|
|      Note that both, infile and outfile can either be filenames
|      or file-like objects (e.g. StringIO).
|
|      The mimetype is automatically determined if an infile is
|      given. If only writing is desired, the mimetype should be
|      set.
|
|  close(self)
|      Close the zip files. According to documentation of zipfile in
|      the standard python lib, this has to be done to be sure
|      everything is written. We copy over the not-yet written files
|      from izip before closing ozip.
|
|  read(self, zname)
|      return an OOoElementTree object for the given OOo document
|      archive member name. Currently an OOo document contains the
|      following XML files::
|
|       * content.xml: the text of the OOo document
|       * styles.xml: style definitions
|       * meta.xml: meta-information (author, last changed, ...)
|       * settings.xml: settings in OOo
|       * META-INF/manifest.xml: contents of the archive
|
|      There is an additional file "mimetype" that always contains
|      the string "application/vnd.sun.xml.writer" for OOo 1.X files
|      and the string "application/vnd.oasis.opendocument.text" for
|      OOo 2.X files.
|
|  write(self, zname, etree)
|
|  ----------------------------------------------------------------------
|  Data descriptors inherited from autosuper:
|
|  __dict__
|      dictionary for instance variables (if defined)
|
|  __weakref__
|      list of weak references to the object (if defined)
|
|  ----------------------------------------------------------------------
|  Data and other attributes inherited from autosuper:
|
|  __metaclass__ = <class 'ooopy.OOoPy._autosuper'>
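
The help text above notes that every OOo document is really a zip archive, with a mimetype member identifying the format and content.xml holding the text. As a minimal sketch of that idea, using only the Python standard library rather than OOoPy itself, the following builds a tiny stand-in archive in memory and reads it back. The helper names (`make_sample_odt`, `describe_document`, `paragraph_texts`) and the sample content are assumptions made for illustration, and the code is written for Python 3.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by OpenDocument (.odt) content.xml files.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Hypothetical sample content, just enough to exercise the functions below.
SAMPLE_CONTENT = (
    '<office:document-content xmlns:office="%s" xmlns:text="%s">'
    '<office:body><office:text>'
    '<text:p>First paragraph</text:p>'
    '<text:p>Second <text:span>styled</text:span> paragraph</text:p>'
    '</office:text></office:body>'
    '</office:document-content>' % (OFFICE_NS, TEXT_NS)
)


def make_sample_odt():
    """Build a tiny stand-in .odt archive in memory (mimetype written first)."""
    buf = io.BytesIO()
    archive = zipfile.ZipFile(buf, "w")
    try:
        archive.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        archive.writestr("content.xml", SAMPLE_CONTENT)
    finally:
        archive.close()
    buf.seek(0)
    return buf


def describe_document(source):
    """Return (mimetype, member names); source is a file name or file-like object."""
    archive = zipfile.ZipFile(source)
    try:
        return archive.read("mimetype").decode("ascii"), archive.namelist()
    finally:
        archive.close()


def paragraph_texts(source):
    """Return the plain text of every text:p element found in content.xml."""
    archive = zipfile.ZipFile(source)
    try:
        root = ET.fromstring(archive.read("content.xml"))
    finally:
        archive.close()
    return ["".join(p.itertext()) for p in root.iter("{%s}p" % TEXT_NS)]


mimetype, members = describe_document(make_sample_odt())
print(mimetype)
print(members)
print(paragraph_texts(make_sample_odt()))
```

With a real document, a file name such as 'test.odt' can be passed in place of the in-memory buffer.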

Transformer help

Help on class Transformer in module ooopy.Transformer:

class Transformer(ooopy.OOoPy.autosuper)
|  Class for applying a set of transforms to a given ooopy object.
|  The transforms are applied to the specified file in priority
|  order. When applying transforms we have a mechanism for
|  communication of transforms. We give the transformer to the
|  individual transforms as a parameter. The transforms may use the
|  transformer like a dictionary for storing values and retrieving
|  values left by previous transforms.
|  As a naming convention each transform should use its class name
|  as a prefix for storing values in the dictionary.
|  >>> import Transforms
|  >>> from Transforms import renumber_all, get_meta, set_meta, meta_counts
|  >>> from StringIO import StringIO
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'test.sxw', outfile = sio)
|  >>> m   = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> body [-1].get (OOo_Tag ('text', 'style-name', mimetype = m))
|  'Standard'
|  >>> def cb (name) :
|  ...     r = { 'street'     : 'Beispielstrasse 42'
|  ...         , 'firstname'  : 'Hugo'
|  ...         , 'salutation' : 'Frau'
|  ...         }
|  ...     if r.has_key (name) : return r [name]
|  ...     return None
|  ...
|  >>> p = get_meta (m)
|  >>> t = Transformer (m, p)
|  >>> t ['a'] = 'a'
|  >>> t ['a']
|  'a'
|  >>> t.transform (o)
|  >>> p.set ('a', 'b')
|  >>> t ['Attribute_Access:a']
|  'b'
|  >>> t   = Transformer (
|  ...       m
|  ...     , Transforms.Autoupdate ()
|  ...     , Transforms.Editinfo   ()
|  ...     , Transforms.Field_Replace (prio = 99, replace = cb)
|  ...     , Transforms.Field_Replace
|  ...         ( replace =
|  ...             { 'salutation' : ''
|  ...             , 'firstname'  : 'Erika'
|  ...             , 'lastname'   : 'Musterfrau'
|  ...             , 'country'    : 'D'
|  ...             , 'postalcode' : '00815'
|  ...             , 'city'       : 'Niemandsdorf'
|  ...             }
|  ...         )
|  ...     , Transforms.Addpagebreak_Style ()
|  ...     , Transforms.Addpagebreak       ()
|  ...     )
|  >>> t.transform (o)
|  >>> o.close ()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("testout.sxw", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> c = o.read ('content.xml')
|  >>> m = o.mimetype
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', mimetype = m)
|  >>> for node in body.findall (vset) :
|  ...     name = node.get (OOo_Tag ('text', 'name', m))
|  ...     print name, ':', node.text
|  salutation : None
|  firstname : Erika
|  lastname : Musterfrau
|  street : Beispielstrasse 42
|  country : D
|  postalcode : 00815
|  city : Niemandsdorf
|  salutation : None
|  firstname : Erika
|  lastname : Musterfrau
|  street : Beispielstrasse 42
|  country : D
|  postalcode : 00815
|  city : Niemandsdorf
|  >>> body [-1].get (OOo_Tag ('text', 'style-name', mimetype = m))
|  'P2'
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'test.sxw', outfile = sio)
|  >>> c = o.read ('content.xml')
|  >>> t   = Transformer (
|  ...       o.mimetype
|  ...     , get_meta (o.mimetype)
|  ...     , Transforms.Addpagebreak_Style ()
|  ...     , Transforms.Mailmerge
|  ...       ( iterator =
|  ...         ( dict (firstname = 'Erika', lastname = 'Nobody')
|  ...         , dict (firstname = 'Eric',  lastname = 'Wizard')
|  ...         , cb
|  ...         )
|  ...       )
|  ...     , renumber_all (o.mimetype)
|  ...     , set_meta (o.mimetype)
|  ...     , Transforms.Fix_OOo_Tag ()
|  ...     )
|  >>> t.transform (o)
|  >>> for i in meta_counts :
|  ...     print i, t [':'.join (('Set_Attribute', i))]
|  character-count 951
|  image-count 0
|  object-count 0
|  page-count 3
|  paragraph-count 113
|  table-count 3
|  word-count 162
|  >>> name = t ['Addpagebreak_Style:stylename']
|  >>> name
|  'P2'
|  >>> o.close ()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("testout2.sxw", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', m))
|  >>> for n in body.findall ('.//*') :
|  ...     zidx = n.get (OOo_Tag ('draw', 'z-index', m))
|  ...     if zidx :
|  ...         print ':'.join(split_tag (n.tag)), zidx
|  draw:text-box 0
|  draw:rect 1
|  draw:text-box 3
|  draw:rect 4
|  draw:text-box 6
|  draw:rect 7
|  draw:text-box 2
|  draw:text-box 5
|  draw:text-box 8
|  >>> for n in body.findall ('.//' + OOo_Tag ('text', 'p', m)) :
|  ...     if n.get (OOo_Tag ('text', 'style-name', m)) == name :
|  ...         print n.tag
|  {http://openoffice.org/2000/text}p
|  {http://openoffice.org/2000/text}p
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', m)
|  >>> for n in body.findall (vset) :
|  ...     if n.get (OOo_Tag ('text', 'name', m), None).endswith ('name') :
|  ...         name = n.get (OOo_Tag ('text', 'name', m))
|  ...         print name, ':', n.text
|  firstname : Erika
|  lastname : Nobody
|  firstname : Eric
|  lastname : Wizard
|  firstname : Hugo
|  lastname : Testman
|  firstname : Erika
|  lastname : Nobody
|  firstname : Eric
|  lastname : Wizard
|  firstname : Hugo
|  lastname : Testman
|  >>> for n in body.findall ('.//' + OOo_Tag ('draw', 'text-box', m)) :
|  ...     print n.get (OOo_Tag ('draw', 'name', m)),
|  ...     print n.get (OOo_Tag ('text', 'anchor-page-number', m))
|  Frame1 1
|  Frame2 2
|  Frame3 3
|  Frame4 None
|  Frame5 None
|  Frame6 None
|  >>> for n in body.findall ('.//' + OOo_Tag ('text', 'section', m)) :
|  ...     print n.get (OOo_Tag ('text', 'name', m))
|  Section1
|  Section2
|  Section3
|  Section4
|  Section5
|  Section6
|  Section7
|  Section8
|  Section9
|  Section10
|  Section11
|  Section12
|  Section13
|  Section14
|  Section15
|  Section16
|  Section17
|  Section18
|  >>> for n in body.findall ('.//' + OOo_Tag ('table', 'table', m)) :
|  ...     print n.get (OOo_Tag ('table', 'name', m))
|  Table1
|  Table2
|  Table3
|  >>> r = o.read ('meta.xml')
|  >>> meta = r.find ('.//' + OOo_Tag ('meta', 'document-statistic', m))
|  >>> for i in meta_counts :
|  ...     print i, repr (meta.get (OOo_Tag ('meta', i, m)))
|  character-count '951'
|  image-count '0'
|  object-count '0'
|  page-count '3'
|  paragraph-count '113'
|  table-count '3'
|  word-count '162'
|  >>> o.close ()
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'test.sxw', outfile = sio)
|  >>> t   = Transformer (
|  ...       o.mimetype
|  ...     , get_meta (o.mimetype)
|  ...     , Transforms.Concatenate ('test.sxw', 'rechng.sxw')
|  ...     , renumber_all (o.mimetype)
|  ...     , set_meta (o.mimetype)
|  ...     , Transforms.Fix_OOo_Tag ()
|  ...     )
|  >>> t.transform (o)
|  >>> for i in meta_counts :
|  ...     print i, repr (t [':'.join (('Set_Attribute', i))])
|  character-count '1131'
|  image-count '0'
|  object-count '0'
|  page-count '3'
|  paragraph-count '168'
|  table-count '2'
|  word-count '160'
|  >>> o.close ()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("testout3.sxw", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> s = o.read ('styles.xml')
|  >>> for n in c.findall ('./*/*') :
|  ...     name = n.get (OOo_Tag ('style', 'name', m))
|  ...     if name :
|  ...         parent = n.get (OOo_Tag ('style', 'parent-style-name', m))
|  ...         print '"%s", "%s"' % (name, parent)
|  "Tahoma1", "None"
|  "Bitstream Vera Sans", "None"
|  "Tahoma", "None"
|  "Nimbus Roman No9 L", "None"
|  "Courier New", "None"
|  "Arial Black", "None"
|  "New Century Schoolbook", "None"
|  "Helvetica", "None"
|  "Table1", "None"
|  "Table1.A", "None"
|  "Table1.A1", "None"
|  "Table1.E1", "None"
|  "Table1.A2", "None"
|  "Table1.E2", "None"
|  "P1", "None"
|  "fr1", "Frame"
|  "fr2", "None"
|  "fr3", "Frame"
|  "Sect1", "None"
|  "gr1", "None"
|  "P2", "Standard"
|  "Standard_Concat", "None"
|  "Concat_P1", "Concat_Frame contents"
|  "Concat_P2", "Concat_Frame contents"
|  "P3", "Concat_Frame contents"
|  "P4", "Concat_Frame contents"
|  "P5", "Concat_Standard"
|  "P6", "Concat_Standard"
|  "P7", "Concat_Frame contents"
|  "P8", "Concat_Frame contents"
|  "P9", "Concat_Frame contents"
|  "P10", "Concat_Frame contents"
|  "P11", "Concat_Frame contents"
|  "P12", "Concat_Frame contents"
|  "P13", "Concat_Frame contents"
|  "P15", "Concat_Standard"
|  "P16", "Concat_Standard"
|  "P17", "Concat_Standard"
|  "P18", "Concat_Standard"
|  "P19", "Concat_Standard"
|  "P20", "Concat_Standard"
|  "P21", "Concat_Standard"
|  "P22", "Concat_Standard"
|  "P23", "Concat_Standard"
|  "T1", "None"
|  "Concat_fr1", "Concat_Frame"
|  "Concat_fr2", "Concat_Frame"
|  "Concat_fr3", "Concat_Frame"
|  "fr4", "Concat_Frame"
|  "fr5", "Concat_Frame"
|  "fr6", "Concat_Frame"
|  "Concat_Sect1", "None"
|  "N0", "None"
|  "N2", "None"
|  "P15_Concat", "Concat_Standard"
|  >>> for n in s.findall ('./*/*') :
|  ...     name = n.get (OOo_Tag ('style', 'name', m))
|  ...     if name :
|  ...         parent = n.get (OOo_Tag ('style', 'parent-style-name', m))
|  ...         print '"%s", "%s"' % (name, parent)
|  "Tahoma1", "None"
|  "Bitstream Vera Sans", "None"
|  "Tahoma", "None"
|  "Nimbus Roman No9 L", "None"
|  "Courier New", "None"
|  "Arial Black", "None"
|  "New Century Schoolbook", "None"
|  "Helvetica", "None"
|  "Standard", "None"
|  "Text body", "Standard"
|  "List", "Text body"
|  "Table Contents", "Text body"
|  "Table Heading", "Table Contents"
|  "Caption", "Standard"
|  "Frame contents", "Text body"
|  "Index", "Standard"
|  "Frame", "None"
|  "OLE", "None"
|  "Concat_Standard", "None"
|  "Concat_Text body", "Concat_Standard"
|  "Concat_List", "Concat_Text body"
|  "Concat_Caption", "Concat_Standard"
|  "Concat_Frame contents", "Concat_Text body"
|  "Concat_Index", "Concat_Standard"
|  "Horizontal Line", "Concat_Standard"
|  "Internet link", "None"
|  "Visited Internet Link", "None"
|  "Concat_Frame", "None"
|  "Concat_OLE", "None"
|  "pm1", "None"
|  "Concat_pm1", "None"
|  "Standard", "None"
|  "Concat_Standard", "None"
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'variable-decl', m)) :
|  ...     name = n.get (OOo_Tag ('text', 'name', m))
|  ...     print name
|  salutation
|  firstname
|  lastname
|  street
|  country
|  postalcode
|  city
|  date
|  invoice.invoice_no
|  invoice.abo.aboprice.abotype.description
|  address.salutation
|  address.title
|  address.firstname
|  address.lastname
|  address.function
|  address.street
|  address.country
|  address.postalcode
|  address.city
|  invoice.subscriber.salutation
|  invoice.subscriber.title
|  invoice.subscriber.firstname
|  invoice.subscriber.lastname
|  invoice.subscriber.function
|  invoice.subscriber.street
|  invoice.subscriber.country
|  invoice.subscriber.postalcode
|  invoice.subscriber.city
|  invoice.period_start
|  invoice.period_end
|  invoice.currency.name
|  invoice.amount
|  invoice.subscriber.initial
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'sequence-decl', m)) :
|  ...     name = n.get (OOo_Tag ('text', 'name', m))
|  ...     print name
|  Illustration
|  Table
|  Text
|  Drawing
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'p', m)) :
|  ...     name = n.get (OOo_Tag ('text', 'style-name', m))
|  ...     if not name or name.startswith ('Concat') :
|  ...         print ">%s<" % name
|  >Concat_P1<
|  >Concat_P2<
|  >Concat_Frame contents<
|  >>> for n in c.findall ('.//' + OOo_Tag ('draw', 'text-box', m)) :
|  ...     attrs = 'name', 'style-name', 'z-index'
|  ...     attrs = [n.get (OOo_Tag ('draw', i, m)) for i in attrs]
|  ...     attrs.append (n.get (OOo_Tag ('text', 'anchor-page-number', m)))
|  ...     print attrs
|  ['Frame1', 'fr1', '0', '1']
|  ['Frame2', 'fr1', '3', '2']
|  ['Frame3', 'Concat_fr1', '6', '3']
|  ['Frame4', 'Concat_fr2', '7', '3']
|  ['Frame5', 'Concat_fr3', '8', '3']
|  ['Frame6', 'Concat_fr1', '9', '3']
|  ['Frame7', 'fr4', '10', '3']
|  ['Frame8', 'fr4', '11', '3']
|  ['Frame9', 'fr4', '12', '3']
|  ['Frame10', 'fr4', '13', '3']
|  ['Frame11', 'fr4', '14', '3']
|  ['Frame12', 'fr4', '15', '3']
|  ['Frame13', 'fr5', '16', '3']
|  ['Frame14', 'fr4', '18', '3']
|  ['Frame15', 'fr4', '19', '3']
|  ['Frame16', 'fr4', '20', '3']
|  ['Frame17', 'fr6', '17', '3']
|  ['Frame18', 'fr4', '23', '3']
|  ['Frame19', 'fr3', '2', None]
|  ['Frame20', 'fr3', '5', None]
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'section', m)) :
|  ...     attrs = 'name', 'style-name'
|  ...     attrs = [n.get (OOo_Tag ('text', i, m)) for i in attrs]
|  ...     print attrs
|  ['Section1', 'Sect1']
|  ['Section2', 'Sect1']
|  ['Section3', 'Sect1']
|  ['Section4', 'Sect1']
|  ['Section5', 'Sect1']
|  ['Section6', 'Sect1']
|  ['Section7', 'Concat_Sect1']
|  ['Section8', 'Concat_Sect1']
|  ['Section9', 'Concat_Sect1']
|  ['Section10', 'Concat_Sect1']
|  ['Section11', 'Concat_Sect1']
|  ['Section12', 'Concat_Sect1']
|  ['Section13', 'Concat_Sect1']
|  ['Section14', 'Concat_Sect1']
|  ['Section15', 'Concat_Sect1']
|  ['Section16', 'Concat_Sect1']
|  ['Section17', 'Concat_Sect1']
|  ['Section18', 'Concat_Sect1']
|  ['Section19', 'Concat_Sect1']
|  ['Section20', 'Concat_Sect1']
|  ['Section21', 'Concat_Sect1']
|  ['Section22', 'Concat_Sect1']
|  ['Section23', 'Concat_Sect1']
|  ['Section24', 'Concat_Sect1']
|  ['Section25', 'Concat_Sect1']
|  ['Section26', 'Concat_Sect1']
|  ['Section27', 'Concat_Sect1']
|  ['Section28', 'Sect1']
|  ['Section29', 'Sect1']
|  ['Section30', 'Sect1']
|  ['Section31', 'Sect1']
|  ['Section32', 'Sect1']
|  ['Section33', 'Sect1']
|  >>> for n in c.findall ('.//' + OOo_Tag ('draw', 'rect', m)) :
|  ...     attrs = 'style-name', 'text-style-name', 'z-index'
|  ...     attrs = [n.get (OOo_Tag ('draw', i, m)) for i in attrs]
|  ...     attrs.append (n.get (OOo_Tag ('text', 'anchor-page-number', m)))
|  ...     print attrs
|  ['gr1', 'P1', '1', '1']
|  ['gr1', 'P1', '4', '2']
|  >>> for n in c.findall ('.//' + OOo_Tag ('draw', 'line', m)) :
|  ...     attrs = 'style-name', 'text-style-name', 'z-index'
|  ...     attrs = [n.get (OOo_Tag ('draw', i, m)) for i in attrs]
|  ...     print attrs
|  ['gr1', 'P1', '24']
|  ['gr1', 'P1', '22']
|  ['gr1', 'P1', '21']
|  >>> for n in s.findall ('.//' + OOo_Tag ('style', 'style', m)) :
|  ...     if n.get (OOo_Tag ('style', 'name', m)).startswith ('Co') :
|  ...         attrs = 'name', 'class', 'family'
|  ...         attrs = [n.get (OOo_Tag ('style', i, m)) for i in attrs]
|  ...         print attrs
|  ...         props = n.find ('./' + OOo_Tag ('style', 'properties', m))
|  ...         if props is not None and len (props) :
|  ...             props [0].tag
|  ['Concat_Standard', 'text', 'paragraph']
|  '{http://openoffice.org/2000/style}tab-stops'
|  ['Concat_Text body', 'text', 'paragraph']
|  ['Concat_List', 'list', 'paragraph']
|  ['Concat_Caption', 'extra', 'paragraph']
|  ['Concat_Frame contents', 'extra', 'paragraph']
|  ['Concat_Index', 'index', 'paragraph']
|  ['Concat_Frame', None, 'graphics']
|  ['Concat_OLE', None, 'graphics']
|  >>> for n in c.findall ('.//*') :
|  ...     zidx = n.get (OOo_Tag ('draw', 'z-index', m))
|  ...     if zidx :
|  ...         print ':'.join(split_tag (n.tag)), zidx
|  draw:text-box 0
|  draw:rect 1
|  draw:text-box 3
|  draw:rect 4
|  draw:text-box 6
|  draw:text-box 7
|  draw:text-box 8
|  draw:text-box 9
|  draw:text-box 10
|  draw:text-box 11
|  draw:text-box 12
|  draw:text-box 13
|  draw:text-box 14
|  draw:text-box 15
|  draw:text-box 16
|  draw:text-box 18
|  draw:text-box 19
|  draw:text-box 20
|  draw:text-box 17
|  draw:text-box 23
|  draw:line 24
|  draw:text-box 2
|  draw:text-box 5
|  draw:line 22
|  draw:line 21
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'carta.stw', outfile = sio)
|  >>> t = Transformer (
|  ...     o.mimetype
|  ...   , get_meta (o.mimetype)
|  ...   , Transforms.Addpagebreak_Style ()
|  ...   , Transforms.Mailmerge
|  ...     ( iterator =
|  ...         ( dict
|  ...             ( Spett = "Spettabile"
|  ...             , contraente = "First person"
|  ...             , indirizzo = "street? 1"
|  ...             , tipo = "racc. A.C."
|  ...             , luogo = "Varese"
|  ...             , oggetto = "Saluti"
|  ...             )
|  ...         , dict
|  ...             ( Spett = "Egregio"
|  ...             , contraente = "Second Person"
|  ...             , indirizzo = "street? 2"
|  ...             , tipo = "Raccomandata"
|  ...             , luogo = "Gavirate"
|  ...             , oggetto = "Ossequi"
|  ...             )
|  ...         )
|  ...     )
|  ...   , renumber_all (o.mimetype)
|  ...   , set_meta (o.mimetype)
|  ...   , Transforms.Fix_OOo_Tag ()
|  ...   )
|  >>> t.transform(o)
|  >>> o.close()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("carta-out.stw", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', mimetype = m)
|  >>> for node in body.findall (vset) :
|  ...     name = node.get (OOo_Tag ('text', 'name', m))
|  ...     print name, ':', node.text
|  Spett : Spettabile
|  contraente : First person
|  indirizzo : street? 1
|  Spett : Egregio
|  contraente : Second Person
|  indirizzo : street? 2
|  tipo : racc. A.C.
|  luogo : Varese
|  oggetto : Saluti
|  tipo : Raccomandata
|  luogo : Gavirate
|  oggetto : Ossequi
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'test.odt', outfile = sio)
|  >>> t   = Transformer (
|  ...       o.mimetype
|  ...     , get_meta (o.mimetype)
|  ...     , Transforms.Addpagebreak_Style ()
|  ...     , Transforms.Mailmerge
|  ...       ( iterator =
|  ...         ( dict (firstname = 'Erika', lastname = 'Nobody')
|  ...         , dict (firstname = 'Eric',  lastname = 'Wizard')
|  ...         , cb
|  ...         )
|  ...       )
|  ...     , renumber_all (o.mimetype)
|  ...     , set_meta (o.mimetype)
|  ...     , Transforms.Fix_OOo_Tag ()
|  ...     )
|  >>> t.transform (o)
|  >>> for i in meta_counts :
|  ...     print i, t [':'.join (('Set_Attribute', i))]
|  character-count 951
|  image-count 0
|  object-count 0
|  page-count 3
|  paragraph-count 53
|  table-count 3
|  word-count 162
|  >>> name = t ['Addpagebreak_Style:stylename']
|  >>> name
|  'P2'
|  >>> o.close ()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("testout.odt", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', m))
|  >>> for n in body.findall ('.//*') :
|  ...     zidx = n.get (OOo_Tag ('draw', 'z-index', m))
|  ...     if zidx :
|  ...         print ':'.join(split_tag (n.tag)), zidx
|  draw:frame 0
|  draw:rect 1
|  draw:frame 3
|  draw:rect 4
|  draw:frame 6
|  draw:rect 7
|  draw:frame 2
|  draw:frame 5
|  draw:frame 8
|  >>> for n in body.findall ('.//' + OOo_Tag ('text', 'p', m)) :
|  ...     if n.get (OOo_Tag ('text', 'style-name', m)) == name :
|  ...         print n.tag
|  {urn:oasis:names:tc:opendocument:xmlns:text:1.0}p
|  {urn:oasis:names:tc:opendocument:xmlns:text:1.0}p
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', m)
|  >>> for n in body.findall (vset) :
|  ...     if n.get (OOo_Tag ('text', 'name', m), None).endswith ('name') :
|  ...         name = n.get (OOo_Tag ('text', 'name', m))
|  ...         print name, ':', n.text
|  firstname : Erika
|  lastname : Nobody
|  firstname : Eric
|  lastname : Wizard
|  firstname : Hugo
|  lastname : Testman
|  firstname : Erika
|  lastname : Nobody
|  firstname : Eric
|  lastname : Wizard
|  firstname : Hugo
|  lastname : Testman
|  >>> for n in body.findall ('.//' + OOo_Tag ('draw', 'frame', m)) :
|  ...     print n.get (OOo_Tag ('draw', 'name', m)),
|  ...     print n.get (OOo_Tag ('text', 'anchor-page-number', m))
|  Frame1 1
|  Frame2 2
|  Frame3 3
|  Frame4 None
|  Frame5 None
|  Frame6 None
|  >>> for n in body.findall ('.//' + OOo_Tag ('text', 'section', m)) :
|  ...     print n.get (OOo_Tag ('text', 'name', m))
|  Section1
|  Section2
|  Section3
|  Section4
|  Section5
|  Section6
|  Section7
|  Section8
|  Section9
|  Section10
|  Section11
|  Section12
|  Section13
|  Section14
|  Section15
|  Section16
|  Section17
|  Section18
|  >>> for n in body.findall ('.//' + OOo_Tag ('table', 'table', m)) :
|  ...     print n.get (OOo_Tag ('table', 'name', m))
|  Table1
|  Table2
|  Table3
|  >>> r = o.read ('meta.xml')
|  >>> meta = r.find ('.//' + OOo_Tag ('meta', 'document-statistic', m))
|  >>> for i in meta_counts :
|  ...     print i, repr (meta.get (OOo_Tag ('meta', i, m)))
|  character-count '951'
|  image-count '0'
|  object-count '0'
|  page-count '3'
|  paragraph-count '53'
|  table-count '3'
|  word-count '162'
|  >>> o.close ()
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'carta.odt', outfile = sio)
|  >>> t = Transformer (
|  ...     o.mimetype
|  ...   , get_meta (o.mimetype)
|  ...   , Transforms.Addpagebreak_Style ()
|  ...   , Transforms.Mailmerge
|  ...     ( iterator =
|  ...         ( dict
|  ...             ( Spett = "Spettabile"
|  ...             , contraente = "First person"
|  ...             , indirizzo = "street? 1"
|  ...             , tipo = "racc. A.C."
|  ...             , luogo = "Varese"
|  ...             , oggetto = "Saluti"
|  ...             )
|  ...         , dict
|  ...             ( Spett = "Egregio"
|  ...             , contraente = "Second Person"
|  ...             , indirizzo = "street? 2"
|  ...             , tipo = "Raccomandata"
|  ...             , luogo = "Gavirate"
|  ...             , oggetto = "Ossequi"
|  ...             )
|  ...         )
|  ...     )
|  ...   , renumber_all (o.mimetype)
|  ...   , set_meta (o.mimetype)
|  ...   , Transforms.Fix_OOo_Tag ()
|  ...   )
|  >>> t.transform(o)
|  >>> o.close()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("carta-out.odt", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', mimetype = m)
|  >>> for node in body.findall (vset) :
|  ...     name = node.get (OOo_Tag ('text', 'name', m))
|  ...     print name, ':', node.text
|  Spett : Spettabile
|  contraente : First person
|  indirizzo : street? 1
|  Spett : Egregio
|  contraente : Second Person
|  indirizzo : street? 2
|  tipo : racc. A.C.
|  luogo : Varese
|  oggetto : Saluti
|  tipo : Raccomandata
|  luogo : Gavirate
|  oggetto : Ossequi
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'test.odt', outfile = sio)
|  >>> t   = Transformer (
|  ...       o.mimetype
|  ...     , get_meta (o.mimetype)
|  ...     , Transforms.Concatenate ('test.odt', 'rechng.odt')
|  ...     , renumber_all (o.mimetype)
|  ...     , set_meta (o.mimetype)
|  ...     , Transforms.Fix_OOo_Tag ()
|  ...     )
|  >>> t.transform (o)
|  >>> for i in meta_counts :
|  ...     print i, repr (t [':'.join (('Set_Attribute', i))])
|  character-count '1131'
|  image-count '0'
|  object-count '0'
|  page-count '3'
|  paragraph-count '80'
|  table-count '2'
|  word-count '159'
|  >>> o.close ()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("testout3.odt", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> s = o.read ('styles.xml')
|  >>> for n in c.findall ('./*/*') :
|  ...     name = n.get (OOo_Tag ('style', 'name', m))
|  ...     if name :
|  ...         parent = n.get (OOo_Tag ('style', 'parent-style-name', m))
|  ...         print '"%s", "%s"' % (name, parent)
|  "Tahoma1", "None"
|  "Bitstream Vera Sans", "None"
|  "Tahoma", "None"
|  "Nimbus Roman No9 L", "None"
|  "Courier New", "None"
|  "Arial Black", "None"
|  "New Century Schoolbook", "None"
|  "Times New Roman", "None"
|  "Arial", "None"
|  "Helvetica", "None"
|  "Table1", "None"
|  "Table1.A", "None"
|  "Table1.A1", "None"
|  "Table1.E1", "None"
|  "Table1.A2", "None"
|  "Table1.E2", "None"
|  "P1", "None"
|  "fr1", "Frame"
|  "fr2", "Frame"
|  "Sect1", "None"
|  "gr1", "None"
|  "P2", "Standard"
|  "Standard_Concat", "None"
|  "Concat_P1", "Concat_Frame_20_contents"
|  "Concat_P2", "Concat_Frame_20_contents"
|  "P3", "Concat_Frame_20_contents"
|  "P4", "Concat_Standard"
|  "P5", "Concat_Standard"
|  "P6", "Concat_Frame_20_contents"
|  "P7", "Concat_Frame_20_contents"
|  "P8", "Concat_Frame_20_contents"
|  "P9", "Concat_Frame_20_contents"
|  "P10", "Concat_Frame_20_contents"
|  "P11", "Concat_Frame_20_contents"
|  "P12", "Concat_Frame_20_contents"
|  "P14", "Concat_Standard"
|  "P15", "Concat_Standard"
|  "P16", "Concat_Standard"
|  "P17", "Concat_Standard"
|  "P18", "Concat_Standard"
|  "P19", "Concat_Standard"
|  "P20", "Concat_Standard"
|  "P21", "Concat_Standard"
|  "P22", "Concat_Standard"
|  "P23", "Concat_Standard"
|  "Concat_fr1", "Frame"
|  "Concat_fr2", "Frame"
|  "fr3", "Frame"
|  "fr4", "Frame"
|  "fr5", "Frame"
|  "fr6", "Frame"
|  "Concat_gr1", "None"
|  "N0", "None"
|  "N2", "None"
|  "P14_Concat", "Concat_Standard"
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'variable-decl', m)) :
|  ...     name = n.get (OOo_Tag ('text', 'name', m))
|  ...     print name
|  salutation
|  firstname
|  lastname
|  street
|  country
|  postalcode
|  city
|  date
|  invoice.invoice_no
|  invoice.abo.aboprice.abotype.description
|  address.salutation
|  address.title
|  address.firstname
|  address.lastname
|  address.function
|  address.street
|  address.country
|  address.postalcode
|  address.city
|  invoice.subscriber.salutation
|  invoice.subscriber.title
|  invoice.subscriber.firstname
|  invoice.subscriber.lastname
|  invoice.subscriber.function
|  invoice.subscriber.street
|  invoice.subscriber.country
|  invoice.subscriber.postalcode
|  invoice.subscriber.city
|  invoice.period_start
|  invoice.period_end
|  invoice.currency.name
|  invoice.amount
|  invoice.subscriber.initial
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'sequence-decl', m)) :
|  ...     name = n.get (OOo_Tag ('text', 'name', m))
|  ...     print name
|  Illustration
|  Table
|  Text
|  Drawing
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'p', m)) :
|  ...     name = n.get (OOo_Tag ('text', 'style-name', m))
|  ...     if not name or name.startswith ('Concat') :
|  ...         print ':'.join(split_tag (n.tag)), ">%s<" % name
|  text:p >None<
|  text:p >None<
|  text:p >Concat_P1<
|  text:p >Concat_P1<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_Frame_20_contents<
|  text:p >None<
|  text:p >None<
|  text:p >None<
|  >>> for n in c.findall ('.//' + OOo_Tag ('draw', 'frame', m)) :
|  ...     attrs = 'name', 'style-name', 'z-index'
|  ...     attrs = [n.get (OOo_Tag ('draw', i, m)) for i in attrs]
|  ...     attrs.append (n.get (OOo_Tag ('text', 'anchor-page-number', m)))
|  ...     print attrs
|  ['Frame1', 'fr1', '0', '1']
|  ['Frame2', 'fr1', '3', '2']
|  ['Frame3', 'Concat_fr1', '6', '3']
|  ['Frame4', 'Concat_fr2', '7', '3']
|  ['Frame5', 'fr3', '8', '3']
|  ['Frame6', 'Concat_fr1', '9', '3']
|  ['Frame7', 'fr4', '10', '3']
|  ['Frame8', 'fr4', '11', '3']
|  ['Frame9', 'fr4', '12', '3']
|  ['Frame10', 'fr4', '13', '3']
|  ['Frame11', 'fr4', '14', '3']
|  ['Frame12', 'fr4', '15', '3']
|  ['Frame13', 'fr5', '16', '3']
|  ['Frame14', 'fr4', '18', '3']
|  ['Frame15', 'fr4', '19', '3']
|  ['Frame16', 'fr4', '20', '3']
|  ['Frame17', 'fr6', '17', '3']
|  ['Frame18', 'fr4', '23', '3']
|  ['Frame19', 'fr2', '2', None]
|  ['Frame20', 'fr2', '5', None]
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'section', m)) :
|  ...     attrs = 'name', 'style-name'
|  ...     attrs = [n.get (OOo_Tag ('text', i, m)) for i in attrs]
|  ...     print attrs
|  ['Section1', 'Sect1']
|  ['Section2', 'Sect1']
|  ['Section3', 'Sect1']
|  ['Section4', 'Sect1']
|  ['Section5', 'Sect1']
|  ['Section6', 'Sect1']
|  ['Section7', 'Sect1']
|  ['Section8', 'Sect1']
|  ['Section9', 'Sect1']
|  ['Section10', 'Sect1']
|  ['Section11', 'Sect1']
|  ['Section12', 'Sect1']
|  ['Section13', 'Sect1']
|  ['Section14', 'Sect1']
|  ['Section15', 'Sect1']
|  ['Section16', 'Sect1']
|  ['Section17', 'Sect1']
|  ['Section18', 'Sect1']
|  ['Section19', 'Sect1']
|  ['Section20', 'Sect1']
|  ['Section21', 'Sect1']
|  ['Section22', 'Sect1']
|  ['Section23', 'Sect1']
|  ['Section24', 'Sect1']
|  ['Section25', 'Sect1']
|  ['Section26', 'Sect1']
|  ['Section27', 'Sect1']
|  ['Section28', 'Sect1']
|  ['Section29', 'Sect1']
|  ['Section30', 'Sect1']
|  ['Section31', 'Sect1']
|  ['Section32', 'Sect1']
|  ['Section33', 'Sect1']
|  >>> for n in c.findall ('.//' + OOo_Tag ('draw', 'rect', m)) :
|  ...     attrs = 'style-name', 'text-style-name', 'z-index'
|  ...     attrs = [n.get (OOo_Tag ('draw', i, m)) for i in attrs]
|  ...     attrs.append (n.get (OOo_Tag ('text', 'anchor-page-number', m)))
|  ...     print attrs
|  ['gr1', 'P1', '1', '1']
|  ['gr1', 'P1', '4', '2']
|  >>> for n in c.findall ('.//' + OOo_Tag ('draw', 'line', m)) :
|  ...     attrs = 'style-name', 'text-style-name', 'z-index'
|  ...     attrs = [n.get (OOo_Tag ('draw', i, m)) for i in attrs]
|  ...     print attrs
|  ['Concat_gr1', 'P1', '24']
|  ['Concat_gr1', 'P1', '22']
|  ['Concat_gr1', 'P1', '21']
|  >>> for n in s.findall ('.//' + OOo_Tag ('style', 'style', m)) :
|  ...     if n.get (OOo_Tag ('style', 'name', m)).startswith ('Co') :
|  ...         attrs = 'name', 'display-name', 'class', 'family'
|  ...         attrs = [n.get (OOo_Tag ('style', i, m)) for i in attrs]
|  ...         print attrs
|  ...         props = n.find ('./' + OOo_Tag ('style', 'properties', m))
|  ...         if props is not None and len (props) :
|  ...             props [0].tag
|  ['Concat_Standard', None, 'text', 'paragraph']
|  ['Concat_Text_20_body', 'Concat Text body', 'text', 'paragraph']
|  ['Concat_List', None, 'list', 'paragraph']
|  ['Concat_Caption', None, 'extra', 'paragraph']
|  ['Concat_Frame_20_contents', 'Concat Frame contents', 'extra', 'paragraph']
|  ['Concat_Index', None, 'index', 'paragraph']
|  >>> for n in c.findall ('.//*') :
|  ...     zidx = n.get (OOo_Tag ('draw', 'z-index', m))
|  ...     if zidx :
|  ...         print ':'.join(split_tag (n.tag)), zidx
|  draw:frame 0
|  draw:rect 1
|  draw:frame 3
|  draw:rect 4
|  draw:frame 6
|  draw:frame 7
|  draw:frame 8
|  draw:frame 9
|  draw:frame 10
|  draw:frame 11
|  draw:frame 12
|  draw:frame 13
|  draw:frame 14
|  draw:frame 15
|  draw:frame 16
|  draw:frame 18
|  draw:frame 19
|  draw:frame 20
|  draw:frame 17
|  draw:frame 23
|  draw:line 24
|  draw:frame 2
|  draw:frame 5
|  draw:line 22
|  draw:line 21
|  >>> from os import system
|  >>> system ('python ./ooo_fieldreplace -i test.odt -o testout.odt '
|  ...         'salutation=Frau firstname=Erika lastname=Musterfrau '
|  ...         'country=D postalcode=00815 city=Niemandsdorf '
|  ...         'street="Beispielstrasse 42"')
|  0
|  >>> o = OOoPy (infile = 'testout.odt')
|  >>> c = o.read ('content.xml')
|  >>> m = o.mimetype
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', mimetype = m)
|  >>> for node in body.findall (vset) :
|  ...     name = node.get (OOo_Tag ('text', 'name', m))
|  ...     print name, ':', node.text
|  salutation : Frau
|  firstname : Erika
|  lastname : Musterfrau
|  street : Beispielstrasse 42
|  country : D
|  postalcode : 00815
|  city : Niemandsdorf
|  salutation : Frau
|  firstname : Erika
|  lastname : Musterfrau
|  street : Beispielstrasse 42
|  country : D
|  postalcode : 00815
|  city : Niemandsdorf
|  >>> o.close ()
|  >>> system ("./ooo_mailmerge -o testout.odt -d, carta.odt x.csv")
|  0
|  >>> o = OOoPy (infile = 'testout.odt')
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', mimetype = m)
|  >>> for node in body.findall (vset) :
|  ...     name = node.get (OOo_Tag ('text', 'name', m))
|  ...     print name, ':', node.text
|  Spett : Spettabile
|  contraente : First person
|  indirizzo : street? 1
|  Spett : Egregio
|  contraente : Second Person
|  indirizzo : street? 2
|  tipo : racc. A.C.
|  luogo : Varese
|  oggetto : Saluti
|  tipo : Raccomandata
|  luogo : Gavirate
|  oggetto : Ossequi
|  >>> o.close ()
|  >>> o   = OOoPy (infile = 'testenum.odt', outfile = 'xyzzy.odt')
|  >>> t   = Transformer (
|  ...       o.mimetype
|  ...     , get_meta (o.mimetype)
|  ...     , Transforms.Addpagebreak_Style ()
|  ...     , Transforms.Mailmerge
|  ...       ( iterator =
|  ...         ( dict (firstname = 'Erika', lastname = 'Nobody')
|  ...         , dict (firstname = 'Eric',  lastname = 'Wizard')
|  ...         , cb
|  ...         )
|  ...       )
|  ...     , renumber_all (o.mimetype)
|  ...     , set_meta (o.mimetype)
|  ...     , Transforms.Fix_OOo_Tag ()
|  ...     )
|  >>> t.transform (o)
|  >>> o.close ()
|  >>> o = OOoPy (infile = 'xyzzy.odt')
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> textlist = './/' + OOo_Tag ('text', 'list', m)
|  >>> for node in body.findall (textlist) :
|  ...     id = node.get (OOo_Tag ('xml', 'id', m))
|  ...     print 'xml:id', ':', id
|  xml:id : list1
|  xml:id : list2
|  xml:id : list3
|
|  Method resolution order:
|      Transformer
|      ooopy.OOoPy.autosuper
|      __builtin__.object
|
|  Methods defined here:
|
|  __getitem__(self, key)
|
|  __init__(self, mimetype, *tf)
|
|  __setitem__(self, key, value)
|
|  insert(self, transform)
|      Insert a new transform
|
|  transform(self, ooopy)
|      Apply all the transforms in priority order.
|      Priority order is global over all transforms.
|
|  ----------------------------------------------------------------------
|  Data descriptors inherited from ooopy.OOoPy.autosuper:
|
|  __dict__
|      dictionary for instance variables (if defined)
|
|  __weakref__
|      list of weak references to the object (if defined)
|
|  ----------------------------------------------------------------------
|  Data and other attributes inherited from ooopy.OOoPy.autosuper:
|
|  __metaclass__ = <class 'ooopy.OOoPy._autosuper'>
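
The doctests above find elements through namespaced tags such as {urn:oasis:names:tc:opendocument:xmlns:text:1.0}p. To get a feel for what that lookup does, here is a minimal sketch that uses only the Python 3 standard library instead of OOoPy. It relies on an ODT file being a ZIP archive that contains content.xml. The names odf_tag, read_paragraphs and make_sample_odt are hypothetical helpers of mine, not part of OOoPy, and the sample document is built in memory only so the sketch can run without a real file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs; the text one matches the tag printed in the doctests above.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def odf_tag(ns_uri, name):
    """Build an ElementTree-style qualified tag, e.g. '{uri}p' (hypothetical helper)."""
    return "{%s}%s" % (ns_uri, name)


def read_paragraphs(odt_file):
    """Return the text of every text:p element found in content.xml."""
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    # itertext() also collects text nested in child elements such as text:span.
    return ["".join(p.itertext()) for p in root.iter(odf_tag(TEXT_NS, "p"))]


def make_sample_odt():
    """Build a tiny ODT-like archive in memory, for demonstration only."""
    content = (
        '<office:document-content xmlns:office="%s" xmlns:text="%s">'
        "<office:body><office:text>"
        "<text:p>First paragraph</text:p>"
        "<text:p>Second <text:span>paragraph</text:span></text:p>"
        "</office:text></office:body>"
        "</office:document-content>"
    ) % (OFFICE_NS, TEXT_NS)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        archive.writestr("content.xml", content)
    buffer.seek(0)
    return buffer


paragraphs = read_paragraphs(make_sample_odt())
print(paragraphs)
```

With a real document, read_paragraphs could be given a file path instead of the in-memory buffer. The resulting strings are plain text, so styles, frames and tables would still need the kind of attribute lookups shown in the doctests.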

 

About José Antonio Meira da Rocha

Journalist and professor of Publishing and Digital Media at the Universidade Federal de Santa Maria, Frederico Westphalen campus, Rio Grande do Sul, Brazil. Doctor of Design from the Graduate Program in Design (PGDesign), Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil, 2023. Master in Media from UNISINOS, São Leopoldo, RS, Brazil, 2003. Specialist in Informatics in Education, Unisinos, 1976.