
How to read BrOffice documents from scripts in Scribus

Scribus imports text documents from BrOffice / LibreOffice / OpenOffice.org, but only manually, through the "File > Open" menu. To import them from a script, as part of some automated task, I have to use a suitable Python module.

That module is OOoPy, whose project is hosted at http://sourceforge.net/projects/ooopy/.

[Author's note: this article is still being expanded]

To download and install the OOoPy module into the Python on my Ubuntu machine, I opened a terminal and ran the following commands:

wget http://ufpr.dl.sourceforge.net/project/ooopy/ooopy/1.6.7680/OOoPy-1.6.7680.tar.gz
tar -vzxf ./OOoPy-1.6.7680.tar.gz
cd ./OOoPy-1.6.7680
sudo python setup.py install

If I repeat this in the future, I will have to change the version numbers to those of the newest release.
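Since only the version number changes between releases, the download URL and archive name can be derived from it. Here is a minimal sketch, assuming the mirror keeps the same path layout as the URL above. The helper names `ooopy_archive_name` and `ooopy_download_url` are made up for this illustration and are not part of OOoPy.

```python
# Hypothetical helpers: rebuild the download URL used above from a version string.
# Assumption: newer releases follow the same directory layout on the mirror.
BASE = "http://ufpr.dl.sourceforge.net/project/ooopy/ooopy"


def ooopy_archive_name(version):
    """Return the tarball name for a given OOoPy version."""
    return "OOoPy-%s.tar.gz" % version


def ooopy_download_url(version):
    """Return the full download URL for a given OOoPy version."""
    return "%s/%s/%s" % (BASE, version, ooopy_archive_name(version))
```

Calling `ooopy_download_url("1.6.7680")` gives back the exact URL passed to wget above; for another release only the argument changes.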

The output of the installation, on Ubuntu, should look something like this:

running install
running build
running build_py
creating build
creating build/lib.linux-i686-2.6
creating build/lib.linux-i686-2.6/ooopy
copying ooopy/Transformer.py -> build/lib.linux-i686-2.6/ooopy
copying ooopy/OOoPy.py -> build/lib.linux-i686-2.6/ooopy
copying ooopy/Version.py -> build/lib.linux-i686-2.6/ooopy
copying ooopy/Transforms.py -> build/lib.linux-i686-2.6/ooopy
copying ooopy/__init__.py -> build/lib.linux-i686-2.6/ooopy
running build_scripts
creating build/scripts-2.6
copying and adjusting ooo_as_text -> build/scripts-2.6
copying and adjusting ooo_cat -> build/scripts-2.6
copying and adjusting ooo_fieldreplace -> build/scripts-2.6
copying ooo_grep -> build/scripts-2.6
copying and adjusting ooo_mailmerge -> build/scripts-2.6
changing mode of build/scripts-2.6/ooo_as_text from 644 to 755
changing mode of build/scripts-2.6/ooo_cat from 644 to 755
changing mode of build/scripts-2.6/ooo_fieldreplace from 644 to 755
changing mode of build/scripts-2.6/ooo_mailmerge from 644 to 755
running install_lib
creating /usr/local/lib/python2.6/dist-packages/ooopy
copying build/lib.linux-i686-2.6/ooopy/Transformer.py -> /usr/local/lib/python2.6/dist-packages/ooopy
copying build/lib.linux-i686-2.6/ooopy/OOoPy.py -> /usr/local/lib/python2.6/dist-packages/ooopy
copying build/lib.linux-i686-2.6/ooopy/Version.py -> /usr/local/lib/python2.6/dist-packages/ooopy
copying build/lib.linux-i686-2.6/ooopy/Transforms.py -> /usr/local/lib/python2.6/dist-packages/ooopy
copying build/lib.linux-i686-2.6/ooopy/__init__.py -> /usr/local/lib/python2.6/dist-packages/ooopy
byte-compiling /usr/local/lib/python2.6/dist-packages/ooopy/Transformer.py to Transformer.pyc
byte-compiling /usr/local/lib/python2.6/dist-packages/ooopy/OOoPy.py to OOoPy.pyc
byte-compiling /usr/local/lib/python2.6/dist-packages/ooopy/Version.py to Version.pyc
byte-compiling /usr/local/lib/python2.6/dist-packages/ooopy/Transforms.py to Transforms.pyc
byte-compiling /usr/local/lib/python2.6/dist-packages/ooopy/__init__.py to __init__.pyc
running install_scripts
copying build/scripts-2.6/ooo_cat -> /usr/local/bin
copying build/scripts-2.6/ooo_mailmerge -> /usr/local/bin
copying build/scripts-2.6/ooo_fieldreplace -> /usr/local/bin
copying build/scripts-2.6/ooo_as_text -> /usr/local/bin
copying build/scripts-2.6/ooo_grep -> /usr/local/bin
changing mode of /usr/local/bin/ooo_cat to 755
changing mode of /usr/local/bin/ooo_mailmerge to 755
changing mode of /usr/local/bin/ooo_fieldreplace to 755
changing mode of /usr/local/bin/ooo_as_text to 755
changing mode of /usr/local/bin/ooo_grep to 755
running install_data
creating /usr/local/share/ooopy
copying test.sxw -> /usr/local/share/ooopy
copying carta.stw -> /usr/local/share/ooopy
copying test.odt -> /usr/local/share/ooopy
copying carta.odt -> /usr/local/share/ooopy
copying rechng.sxw -> /usr/local/share/ooopy
copying rechng.odt -> /usr/local/share/ooopy
copying run_doctest.py -> /usr/local/share/ooopy
copying x.csv -> /usr/local/share/ooopy
running install_egg_info
Writing /usr/local/lib/python2.6/dist-packages/OOoPy-1.6.7680.egg-info

Downloading and installing on Windows is more complicated. Since I have the wget downloader and the 7z archiver, the sequence of commands I used was:

wget http://ufpr.dl.sourceforge.net/project/ooopy/ooopy/1.6.7680/OOoPy-1.6.7680.tar.gz
"%PROGRAMFILES%"\7-Zip\7z.exe x OOoPy-1.6.7680.tar.gz
"%PROGRAMFILES%"\7-Zip\7z.exe x -o"%PROGRAMFILES%"\"Scribus 1.3.9"\lib\site-packages\ OOoPy-1.6.7680.tar
del OOoPy-1.6.7680.tar

I ran the setup.py file through the Scribus Scripter (menu "Script > Execute Script...", then browsed to the file at "%PROGRAMFILES%"\"Scribus 1.3.9"\lib\site-packages\OOoPy-1.6.7680\setup.py).

Next, I copied the folder "%PROGRAMFILES%"\"Scribus 1.3.9"\lib\site-packages\OOoPy-1.6.7680\ooopy\ into the folder "%PROGRAMFILES%"\"Scribus 1.3.9"\lib\site-packages. This method is a bit of a hack, because it does not create the standard egg-info file that the Linux installation creates.

Another option is to install the module into the standard Python and then copy the installed files over to the Python bundled with Scribus:

c:\>cd "c:\Python26\Lib\site-packages\OOoPy-1.6.7680\"

C:\Python26\Lib\site-packages\OOoPy-1.6.7680>c:\Python26\python.exe setup.py install
running install
running build
running build_py
creating build
creating build\lib
creating build\lib\ooopy
copying ooopy\OOoPy.py -> build\lib\ooopy
copying ooopy\Transformer.py -> build\lib\ooopy
copying ooopy\Transforms.py -> build\lib\ooopy
copying ooopy\Version.py -> build\lib\ooopy
copying ooopy\__init__.py -> build\lib\ooopy
running build_scripts
creating build\scripts-2.6
copying and adjusting ooo_as_text -> build\scripts-2.6
copying and adjusting ooo_cat -> build\scripts-2.6
copying and adjusting ooo_fieldreplace -> build\scripts-2.6
copying ooo_grep -> build\scripts-2.6
copying and adjusting ooo_mailmerge -> build\scripts-2.6
running install_lib
creating c:\Python26\Lib\site-packages\ooopy
copying build\lib\ooopy\OOoPy.py -> c:\Python26\Lib\site-packages\ooopy
copying build\lib\ooopy\Transformer.py -> c:\Python26\Lib\site-packages\ooopy
copying build\lib\ooopy\Transforms.py -> c:\Python26\Lib\site-packages\ooopy
copying build\lib\ooopy\Version.py -> c:\Python26\Lib\site-packages\ooopy
copying build\lib\ooopy\__init__.py -> c:\Python26\Lib\site-packages\ooopy
byte-compiling c:\Python26\Lib\site-packages\ooopy\OOoPy.py to OOoPy.pyc
byte-compiling c:\Python26\Lib\site-packages\ooopy\Transformer.py to Transformer.pyc
byte-compiling c:\Python26\Lib\site-packages\ooopy\Transforms.py to Transforms.pyc
byte-compiling c:\Python26\Lib\site-packages\ooopy\Version.py to Version.pyc
byte-compiling c:\Python26\Lib\site-packages\ooopy\__init__.py to __init__.pyc
running install_scripts
copying build\scripts-2.6\ooo_as_text -> c:\Python26\Scripts
copying build\scripts-2.6\ooo_cat -> c:\Python26\Scripts
copying build\scripts-2.6\ooo_fieldreplace -> c:\Python26\Scripts
copying build\scripts-2.6\ooo_grep -> c:\Python26\Scripts
copying build\scripts-2.6\ooo_mailmerge -> c:\Python26\Scripts
running install_data
creating c:\Python26\share
creating c:\Python26\share\ooopy
copying test.sxw -> c:\Python26\share\ooopy
copying carta.stw -> c:\Python26\share\ooopy
copying test.odt -> c:\Python26\share\ooopy
copying carta.odt -> c:\Python26\share\ooopy
copying rechng.sxw -> c:\Python26\share\ooopy
copying rechng.odt -> c:\Python26\share\ooopy
copying run_doctest.py -> c:\Python26\share\ooopy
copying x.csv -> c:\Python26\share\ooopy
running install_egg_info
Writing c:\Python26\Lib\site-packages\OOoPy-1.6.7680-py2.6.egg-info
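The log above only covers the install into the standard Python; the copy into Scribus is a separate step. It could be sketched in Python as below. The function name `copy_package` is hypothetical, and the paths in the usage note are only the ones that appear in this article, so adjust them to your own installation.

```python
import os
import shutil


def copy_package(src_site_packages, dst_site_packages, package="ooopy"):
    """Copy an installed package directory from one site-packages to another.

    Any previous copy at the destination is removed first, so the result
    always mirrors the source directory.
    """
    src = os.path.join(src_site_packages, package)
    dst = os.path.join(dst_site_packages, package)
    if os.path.isdir(dst):
        shutil.rmtree(dst)  # drop the stale copy before copying again
    shutil.copytree(src, dst)
    return dst
```

On my machine the call would be something like `copy_package(r"c:\Python26\Lib\site-packages", r"C:\Program Files\Scribus 1.3.9\lib\site-packages")`. Like the manual copy, this does not create an egg-info file on the Scribus side.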

 

To find out how the OOoPy module works, I ran the following commands in the Scribus Scripter console:

from ooopy.OOoPy import OOoPy
help (OOoPy)
from ooopy.Transformer import Transformer
help (Transformer)
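To keep the help text in a file instead of reading it in the console, the text that help() shows can be rendered to a string with the standard pydoc module. This is a minimal sketch assuming a current Python 3 (the Python 2.6 bundled with Scribus 1.3.9 may lack `pydoc.plaintext`). It uses `zipfile.ZipFile` as a stand-in so that it runs without OOoPy installed; `help_text` is a hypothetical helper name.

```python
import pydoc
import zipfile  # stand-in object; with OOoPy installed, pass the OOoPy class instead


def help_text(obj):
    """Return the documentation help() would display, as a plain string."""
    # pydoc.plaintext avoids the bold/overstrike characters used for terminals.
    return pydoc.render_doc(obj, renderer=pydoc.plaintext)


text = help_text(zipfile.ZipFile)
print(text.splitlines()[0])
```

With OOoPy available, `help_text(OOoPy)` should produce a listing like the ones reproduced further down, which can then be written to a file.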

On Ubuntu 10, I have to insert these paths before running the code above, because of a bug in Ubuntu itself:

import sys
sys.path.insert(0,'/usr/lib/python2.6/')
sys.path.insert(0,'')
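If the script is run several times in the same Scripter session, the same entries pile up in sys.path. A small sketch of a guard against that follows; `prepend_path` is a hypothetical helper name, not something Scribus or Ubuntu provides.

```python
import sys


def prepend_path(path, paths=sys.path):
    """Move or add `path` to the front of `paths`, without duplicating it."""
    if path in paths:
        paths.remove(path)  # drop the old position so it appears only once
    paths.insert(0, path)
    return paths


# Same two entries as above; calling this again leaves sys.path unchanged.
prepend_path('/usr/lib/python2.6/')
prepend_path('')
```

The order of the two calls matches the original lines, so the empty string (the current directory) ends up first.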

OOoPy help

Help on class OOoPy in module ooopy.OOoPy:

class OOoPy(autosuper)
|  Wrapper for OpenOffice.org zip files (all OOo documents are
|  really zip files internally).
|
|  from ooopy.OOoPy import OOoPy
|  >>> o = OOoPy (infile = 'test.sxw', outfile = 'out.sxw')
|  >>> o.mimetype
|  'application/vnd.sun.xml.writer'
|  >>> for f in files :
|  ...     e = o.read (f)
|  ...     e.write ()
|  ...
|  >>> o.close ()
|  >>> o = OOoPy (infile = 'test.odt', outfile = 'out2.odt')
|  >>> o.mimetype
|  'application/vnd.oasis.opendocument.text'
|  >>> for f in files :
|  ...     e = o.read (f)
|  ...     e.write ()
|  ...
|  >>> o.close ()
|  >>> o = OOoPy (infile = 'out2.odt')
|  >>> for f in o.izip.infolist () :
|  ...     print f.filename, f.create_system
|  mimetype 0
|  content.xml 0
|  styles.xml 0
|  meta.xml 0
|  settings.xml 0
|  META-INF/manifest.xml 0
|  Configurations2/statusbar/ 0
|  Configurations2/accelerator/current.xml 0
|  Configurations2/floater/ 0
|  Configurations2/popupmenu/ 0
|  Configurations2/progressbar/ 0
|  Configurations2/menubar/ 0
|  Configurations2/toolbar/ 0
|  Configurations2/images/Bitmaps/ 0
|  Thumbnails/thumbnail.png 0
|  >>> for f in o.izip.infolist () :
|  ...     print f.filename, f.compress_type, f.compress_size, f.file_size
|  mimetype 8 41 39
|  content.xml 8 1930 16212
|  styles.xml 8 1888 12743
|  meta.xml 8 436 1545
|  settings.xml 8 1376 7862
|  META-INF/manifest.xml 8 286 1845
|  Configurations2/statusbar/ 0 0 0
|  Configurations2/accelerator/current.xml 8 2 0
|  Configurations2/floater/ 0 0 0
|  Configurations2/popupmenu/ 0 0 0
|  Configurations2/progressbar/ 0 0 0
|  Configurations2/menubar/ 0 0 0
|  Configurations2/toolbar/ 0 0 0
|  Configurations2/images/Bitmaps/ 0 0 0
|  Thumbnails/thumbnail.png 8 2145 2367
|
|  Method resolution order:
|      OOoPy
|      autosuper
|      __builtin__.object
|
|  Methods defined here:
|
|  __del__ = close(self)
|
|  __init__(self, infile=None, outfile=None, write_mode='w', mimetype=None)
|      Open an OOo document, if no outfile is given, we open the
|      file read-only. Otherwise the outfile has to be different
|      from the infile -- the python ZipFile can't deal with
|      read-write access. In case an outfile is given, we open it
|      in "w" mode as a zip file, unless write_mode is specified
|      (the only allowed case would be "a" for appending to an
|      existing file, see pythons ZipFile documentation for
|      details). If no infile is given, the user is responsible for
|      providing all necessary files in the resulting output file.
|
|      It seems that OOo needs to have the mimetype as the first
|      archive member (at least with mimetype as the first member
|      it works, the order may not be arbitrary) to recognize a zip
|      archive as an OOo file. When copying from a given infile, we
|      use the same order of elements in the resulting output. When
|      creating new elements we make sure the mimetype is the first
|      in the resulting archive.
|
|      Note that both, infile and outfile can either be filenames
|      or file-like objects (e.g. StringIO).
|
|      The mimetype is automatically determined if an infile is
|      given. If only writing is desired, the mimetype should be
|      set.
|
|  close(self)
|      Close the zip files. According to documentation of zipfile in
|      the standard python lib, this has to be done to be sure
|      everything is written. We copy over the not-yet written files
|      from izip before closing ozip.
|
|  read(self, zname)
|      return an OOoElementTree object for the given OOo document
|      archive member name. Currently an OOo document contains the
|      following XML files::
|
|       * content.xml: the text of the OOo document
|       * styles.xml: style definitions
|       * meta.xml: meta-information (author, last changed, ...)
|       * settings.xml: settings in OOo
|       * META-INF/manifest.xml: contents of the archive
|
|      There is an additional file "mimetype" that always contains
|      the string "application/vnd.sun.xml.writer" for OOo 1.X files
|      and the string "application/vnd.oasis.opendocument.text" for
|      OOo 2.X files.
|
|  write(self, zname, etree)
|
|  ----------------------------------------------------------------------
|  Data descriptors inherited from autosuper:
|
|  __dict__
|      dictionary for instance variables (if defined)
|
|  __weakref__
|      list of weak references to the object (if defined)
|
|  ----------------------------------------------------------------------
|  Data and other attributes inherited from autosuper:
|
|  __metaclass__ = <class 'ooopy.OOoPy._autosuper'>
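As the help text says, these documents are ordinary zip archives whose text lives in content.xml. The idea can be tried out with nothing but the Python standard library. The sketch below does not use OOoPy at all: it builds a tiny ODT-like archive in memory (with the mimetype as first member, as recommended above) and reads the paragraphs back. It is written for Python 3; the function names are hypothetical, and the archive is deliberately incomplete (no manifest, no styles), so it is an illustration rather than a file a word processor is guaranteed to open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespaces and mimetype (the text namespace also shows up
# in the Transformer help output for test.odt).
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def make_tiny_odt(paragraphs):
    """Build a minimal ODT-like zip in memory (illustration only)."""
    root = ET.Element("{%s}document-content" % OFFICE_NS)
    body = ET.SubElement(root, "{%s}body" % OFFICE_NS)
    text = ET.SubElement(body, "{%s}text" % OFFICE_NS)
    for p in paragraphs:
        ET.SubElement(text, "{%s}p" % TEXT_NS).text = p
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        # The mimetype entry is written first and left uncompressed.
        z.writestr("mimetype", ODT_MIMETYPE, compress_type=zipfile.ZIP_STORED)
        z.writestr("content.xml", ET.tostring(root))
    buf.seek(0)
    return buf


def read_paragraphs(odt_file):
    """Return (mimetype, list of paragraph texts) from a filename or file object."""
    with zipfile.ZipFile(odt_file) as z:
        mimetype = z.read("mimetype").decode("ascii")
        root = ET.fromstring(z.read("content.xml"))
    paragraphs = ["".join(p.itertext()) for p in root.iter("{%s}p" % TEXT_NS)]
    return mimetype, paragraphs


mimetype, paragraphs = read_paragraphs(make_tiny_odt(["Hello", "World"]))
print(mimetype, paragraphs)
```

`read_paragraphs` should also accept the path of a real .odt file, since only the archive member names and the text namespace are assumed. OOoPy wraps this same zip-plus-XML structure and adds the transforms described next.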

Transformer help

Help on class Transformer in module ooopy.Transformer:

class Transformer(ooopy.OOoPy.autosuper)
|  Class for applying a set of transforms to a given ooopy object.
|  The transforms are applied to the specified file in priority
|  order. When applying transforms we have a mechanism for
|  communication of transforms. We give the transformer to the
|  individual transforms as a parameter. The transforms may use the
|  transformer like a dictionary for storing values and retrieving
|  values left by previous transforms.
|  As a naming convention each transform should use its class name
|  as a prefix for storing values in the dictionary.
|  >>> import Transforms
|  >>> from Transforms import renumber_all, get_meta, set_meta, meta_counts
|  >>> from StringIO import StringIO
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'test.sxw', outfile = sio)
|  >>> m   = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> body [-1].get (OOo_Tag ('text', 'style-name', mimetype = m))
|  'Standard'
|  >>> def cb (name) :
|  ...     r = { 'street'     : 'Beispielstrasse 42'
|  ...         , 'firstname'  : 'Hugo'
|  ...         , 'salutation' : 'Frau'
|  ...         }
|  ...     if r.has_key (name) : return r [name]
|  ...     return None
|  ...
|  >>> p = get_meta (m)
|  >>> t = Transformer (m, p)
|  >>> t ['a'] = 'a'
|  >>> t ['a']
|  'a'
|  >>> t.transform (o)
|  >>> p.set ('a', 'b')
|  >>> t ['Attribute_Access:a']
|  'b'
|  >>> t   = Transformer (
|  ...       m
|  ...     , Transforms.Autoupdate ()
|  ...     , Transforms.Editinfo   ()
|  ...     , Transforms.Field_Replace (prio = 99, replace = cb)
|  ...     , Transforms.Field_Replace
|  ...         ( replace =
|  ...             { 'salutation' : ''
|  ...             , 'firstname'  : 'Erika'
|  ...             , 'lastname'   : 'Musterfrau'
|  ...             , 'country'    : 'D'
|  ...             , 'postalcode' : '00815'
|  ...             , 'city'       : 'Niemandsdorf'
|  ...             }
|  ...         )
|  ...     , Transforms.Addpagebreak_Style ()
|  ...     , Transforms.Addpagebreak       ()
|  ...     )
|  >>> t.transform (o)
|  >>> o.close ()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("testout.sxw", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> c = o.read ('content.xml')
|  >>> m = o.mimetype
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', mimetype = m)
|  >>> for node in body.findall (vset) :
|  ...     name = node.get (OOo_Tag ('text', 'name', m))
|  ...     print name, ':', node.text
|  salutation : None
|  firstname : Erika
|  lastname : Musterfrau
|  street : Beispielstrasse 42
|  country : D
|  postalcode : 00815
|  city : Niemandsdorf
|  salutation : None
|  firstname : Erika
|  lastname : Musterfrau
|  street : Beispielstrasse 42
|  country : D
|  postalcode : 00815
|  city : Niemandsdorf
|  >>> body [-1].get (OOo_Tag ('text', 'style-name', mimetype = m))
|  'P2'
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'test.sxw', outfile = sio)
|  >>> c = o.read ('content.xml')
|  >>> t   = Transformer (
|  ...       o.mimetype
|  ...     , get_meta (o.mimetype)
|  ...     , Transforms.Addpagebreak_Style ()
|  ...     , Transforms.Mailmerge
|  ...       ( iterator =
|  ...         ( dict (firstname = 'Erika', lastname = 'Nobody')
|  ...         , dict (firstname = 'Eric',  lastname = 'Wizard')
|  ...         , cb
|  ...         )
|  ...       )
|  ...     , renumber_all (o.mimetype)
|  ...     , set_meta (o.mimetype)
|  ...     , Transforms.Fix_OOo_Tag ()
|  ...     )
|  >>> t.transform (o)
|  >>> for i in meta_counts :
|  ...     print i, t [':'.join (('Set_Attribute', i))]
|  character-count 951
|  image-count 0
|  object-count 0
|  page-count 3
|  paragraph-count 113
|  table-count 3
|  word-count 162
|  >>> name = t ['Addpagebreak_Style:stylename']
|  >>> name
|  'P2'
|  >>> o.close ()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("testout2.sxw", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', m))
|  >>> for n in body.findall ('.//*') :
|  ...     zidx = n.get (OOo_Tag ('draw', 'z-index', m))
|  ...     if zidx :
|  ...         print ':'.join(split_tag (n.tag)), zidx
|  draw:text-box 0
|  draw:rect 1
|  draw:text-box 3
|  draw:rect 4
|  draw:text-box 6
|  draw:rect 7
|  draw:text-box 2
|  draw:text-box 5
|  draw:text-box 8
|  >>> for n in body.findall ('.//' + OOo_Tag ('text', 'p', m)) :
|  ...     if n.get (OOo_Tag ('text', 'style-name', m)) == name :
|  ...         print n.tag
|  {http://openoffice.org/2000/text}p
|  {http://openoffice.org/2000/text}p
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', m)
|  >>> for n in body.findall (vset) :
|  ...     if n.get (OOo_Tag ('text', 'name', m), None).endswith ('name') :
|  ...         name = n.get (OOo_Tag ('text', 'name', m))
|  ...         print name, ':', n.text
|  firstname : Erika
|  lastname : Nobody
|  firstname : Eric
|  lastname : Wizard
|  firstname : Hugo
|  lastname : Testman
|  firstname : Erika
|  lastname : Nobody
|  firstname : Eric
|  lastname : Wizard
|  firstname : Hugo
|  lastname : Testman
|  >>> for n in body.findall ('.//' + OOo_Tag ('draw', 'text-box', m)) :
|  ...     print n.get (OOo_Tag ('draw', 'name', m)),
|  ...     print n.get (OOo_Tag ('text', 'anchor-page-number', m))
|  Frame1 1
|  Frame2 2
|  Frame3 3
|  Frame4 None
|  Frame5 None
|  Frame6 None
|  >>> for n in body.findall ('.//' + OOo_Tag ('text', 'section', m)) :
|  ...     print n.get (OOo_Tag ('text', 'name', m))
|  Section1
|  Section2
|  Section3
|  Section4
|  Section5
|  Section6
|  Section7
|  Section8
|  Section9
|  Section10
|  Section11
|  Section12
|  Section13
|  Section14
|  Section15
|  Section16
|  Section17
|  Section18
|  >>> for n in body.findall ('.//' + OOo_Tag ('table', 'table', m)) :
|  ...     print n.get (OOo_Tag ('table', 'name', m))
|  Table1
|  Table2
|  Table3
|  >>> r = o.read ('meta.xml')
|  >>> meta = r.find ('.//' + OOo_Tag ('meta', 'document-statistic', m))
|  >>> for i in meta_counts :
|  ...     print i, repr (meta.get (OOo_Tag ('meta', i, m)))
|  character-count '951'
|  image-count '0'
|  object-count '0'
|  page-count '3'
|  paragraph-count '113'
|  table-count '3'
|  word-count '162'
|  >>> o.close ()
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'test.sxw', outfile = sio)
|  >>> t   = Transformer (
|  ...       o.mimetype
|  ...     , get_meta (o.mimetype)
|  ...     , Transforms.Concatenate ('test.sxw', 'rechng.sxw')
|  ...     , renumber_all (o.mimetype)
|  ...     , set_meta (o.mimetype)
|  ...     , Transforms.Fix_OOo_Tag ()
|  ...     )
|  >>> t.transform (o)
|  >>> for i in meta_counts :
|  ...     print i, repr (t [':'.join (('Set_Attribute', i))])
|  character-count '1131'
|  image-count '0'
|  object-count '0'
|  page-count '3'
|  paragraph-count '168'
|  table-count '2'
|  word-count '160'
|  >>> o.close ()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("testout3.sxw", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> s = o.read ('styles.xml')
|  >>> for n in c.findall ('./*/*') :
|  ...     name = n.get (OOo_Tag ('style', 'name', m))
|  ...     if name :
|  ...         parent = n.get (OOo_Tag ('style', 'parent-style-name', m))
|  ...         print '"%s", "%s"' % (name, parent)
|  "Tahoma1", "None"
|  "Bitstream Vera Sans", "None"
|  "Tahoma", "None"
|  "Nimbus Roman No9 L", "None"
|  "Courier New", "None"
|  "Arial Black", "None"
|  "New Century Schoolbook", "None"
|  "Helvetica", "None"
|  "Table1", "None"
|  "Table1.A", "None"
|  "Table1.A1", "None"
|  "Table1.E1", "None"
|  "Table1.A2", "None"
|  "Table1.E2", "None"
|  "P1", "None"
|  "fr1", "Frame"
|  "fr2", "None"
|  "fr3", "Frame"
|  "Sect1", "None"
|  "gr1", "None"
|  "P2", "Standard"
|  "Standard_Concat", "None"
|  "Concat_P1", "Concat_Frame contents"
|  "Concat_P2", "Concat_Frame contents"
|  "P3", "Concat_Frame contents"
|  "P4", "Concat_Frame contents"
|  "P5", "Concat_Standard"
|  "P6", "Concat_Standard"
|  "P7", "Concat_Frame contents"
|  "P8", "Concat_Frame contents"
|  "P9", "Concat_Frame contents"
|  "P10", "Concat_Frame contents"
|  "P11", "Concat_Frame contents"
|  "P12", "Concat_Frame contents"
|  "P13", "Concat_Frame contents"
|  "P15", "Concat_Standard"
|  "P16", "Concat_Standard"
|  "P17", "Concat_Standard"
|  "P18", "Concat_Standard"
|  "P19", "Concat_Standard"
|  "P20", "Concat_Standard"
|  "P21", "Concat_Standard"
|  "P22", "Concat_Standard"
|  "P23", "Concat_Standard"
|  "T1", "None"
|  "Concat_fr1", "Concat_Frame"
|  "Concat_fr2", "Concat_Frame"
|  "Concat_fr3", "Concat_Frame"
|  "fr4", "Concat_Frame"
|  "fr5", "Concat_Frame"
|  "fr6", "Concat_Frame"
|  "Concat_Sect1", "None"
|  "N0", "None"
|  "N2", "None"
|  "P15_Concat", "Concat_Standard"
|  >>> for n in s.findall ('./*/*') :
|  ...     name = n.get (OOo_Tag ('style', 'name', m))
|  ...     if name :
|  ...         parent = n.get (OOo_Tag ('style', 'parent-style-name', m))
|  ...         print '"%s", "%s"' % (name, parent)
|  "Tahoma1", "None"
|  "Bitstream Vera Sans", "None"
|  "Tahoma", "None"
|  "Nimbus Roman No9 L", "None"
|  "Courier New", "None"
|  "Arial Black", "None"
|  "New Century Schoolbook", "None"
|  "Helvetica", "None"
|  "Standard", "None"
|  "Text body", "Standard"
|  "List", "Text body"
|  "Table Contents", "Text body"
|  "Table Heading", "Table Contents"
|  "Caption", "Standard"
|  "Frame contents", "Text body"
|  "Index", "Standard"
|  "Frame", "None"
|  "OLE", "None"
|  "Concat_Standard", "None"
|  "Concat_Text body", "Concat_Standard"
|  "Concat_List", "Concat_Text body"
|  "Concat_Caption", "Concat_Standard"
|  "Concat_Frame contents", "Concat_Text body"
|  "Concat_Index", "Concat_Standard"
|  "Horizontal Line", "Concat_Standard"
|  "Internet link", "None"
|  "Visited Internet Link", "None"
|  "Concat_Frame", "None"
|  "Concat_OLE", "None"
|  "pm1", "None"
|  "Concat_pm1", "None"
|  "Standard", "None"
|  "Concat_Standard", "None"
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'variable-decl', m)) :
|  ...     name = n.get (OOo_Tag ('text', 'name', m))
|  ...     print name
|  salutation
|  firstname
|  lastname
|  street
|  country
|  postalcode
|  city
|  date
|  invoice.invoice_no
|  invoice.abo.aboprice.abotype.description
|  address.salutation
|  address.title
|  address.firstname
|  address.lastname
|  address.function
|  address.street
|  address.country
|  address.postalcode
|  address.city
|  invoice.subscriber.salutation
|  invoice.subscriber.title
|  invoice.subscriber.firstname
|  invoice.subscriber.lastname
|  invoice.subscriber.function
|  invoice.subscriber.street
|  invoice.subscriber.country
|  invoice.subscriber.postalcode
|  invoice.subscriber.city
|  invoice.period_start
|  invoice.period_end
|  invoice.currency.name
|  invoice.amount
|  invoice.subscriber.initial
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'sequence-decl', m)) :
|  ...     name = n.get (OOo_Tag ('text', 'name', m))
|  ...     print name
|  Illustration
|  Table
|  Text
|  Drawing
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'p', m)) :
|  ...     name = n.get (OOo_Tag ('text', 'style-name', m))
|  ...     if not name or name.startswith ('Concat') :
|  ...         print ">%s<" % name
|  >Concat_P1<
|  >Concat_P2<
|  >Concat_Frame contents<
|  >>> for n in c.findall ('.//' + OOo_Tag ('draw', 'text-box', m)) :
|  ...     attrs = 'name', 'style-name', 'z-index'
|  ...     attrs = [n.get (OOo_Tag ('draw', i, m)) for i in attrs]
|  ...     attrs.append (n.get (OOo_Tag ('text', 'anchor-page-number', m)))
|  ...     print attrs
|  ['Frame1', 'fr1', '0', '1']
|  ['Frame2', 'fr1', '3', '2']
|  ['Frame3', 'Concat_fr1', '6', '3']
|  ['Frame4', 'Concat_fr2', '7', '3']
|  ['Frame5', 'Concat_fr3', '8', '3']
|  ['Frame6', 'Concat_fr1', '9', '3']
|  ['Frame7', 'fr4', '10', '3']
|  ['Frame8', 'fr4', '11', '3']
|  ['Frame9', 'fr4', '12', '3']
|  ['Frame10', 'fr4', '13', '3']
|  ['Frame11', 'fr4', '14', '3']
|  ['Frame12', 'fr4', '15', '3']
|  ['Frame13', 'fr5', '16', '3']
|  ['Frame14', 'fr4', '18', '3']
|  ['Frame15', 'fr4', '19', '3']
|  ['Frame16', 'fr4', '20', '3']
|  ['Frame17', 'fr6', '17', '3']
|  ['Frame18', 'fr4', '23', '3']
|  ['Frame19', 'fr3', '2', None]
|  ['Frame20', 'fr3', '5', None]
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'section', m)) :
|  ...     attrs = 'name', 'style-name'
|  ...     attrs = [n.get (OOo_Tag ('text', i, m)) for i in attrs]
|  ...     print attrs
|  ['Section1', 'Sect1']
|  ['Section2', 'Sect1']
|  ['Section3', 'Sect1']
|  ['Section4', 'Sect1']
|  ['Section5', 'Sect1']
|  ['Section6', 'Sect1']
|  ['Section7', 'Concat_Sect1']
|  ['Section8', 'Concat_Sect1']
|  ['Section9', 'Concat_Sect1']
|  ['Section10', 'Concat_Sect1']
|  ['Section11', 'Concat_Sect1']
|  ['Section12', 'Concat_Sect1']
|  ['Section13', 'Concat_Sect1']
|  ['Section14', 'Concat_Sect1']
|  ['Section15', 'Concat_Sect1']
|  ['Section16', 'Concat_Sect1']
|  ['Section17', 'Concat_Sect1']
|  ['Section18', 'Concat_Sect1']
|  ['Section19', 'Concat_Sect1']
|  ['Section20', 'Concat_Sect1']
|  ['Section21', 'Concat_Sect1']
|  ['Section22', 'Concat_Sect1']
|  ['Section23', 'Concat_Sect1']
|  ['Section24', 'Concat_Sect1']
|  ['Section25', 'Concat_Sect1']
|  ['Section26', 'Concat_Sect1']
|  ['Section27', 'Concat_Sect1']
|  ['Section28', 'Sect1']
|  ['Section29', 'Sect1']
|  ['Section30', 'Sect1']
|  ['Section31', 'Sect1']
|  ['Section32', 'Sect1']
|  ['Section33', 'Sect1']
|  >>> for n in c.findall ('.//' + OOo_Tag ('draw', 'rect', m)) :
|  ...     attrs = 'style-name', 'text-style-name', 'z-index'
|  ...     attrs = [n.get (OOo_Tag ('draw', i, m)) for i in attrs]
|  ...     attrs.append (n.get (OOo_Tag ('text', 'anchor-page-number', m)))
|  ...     print attrs
|  ['gr1', 'P1', '1', '1']
|  ['gr1', 'P1', '4', '2']
|  >>> for n in c.findall ('.//' + OOo_Tag ('draw', 'line', m)) :
|  ...     attrs = 'style-name', 'text-style-name', 'z-index'
|  ...     attrs = [n.get (OOo_Tag ('draw', i, m)) for i in attrs]
|  ...     print attrs
|  ['gr1', 'P1', '24']
|  ['gr1', 'P1', '22']
|  ['gr1', 'P1', '21']
|  >>> for n in s.findall ('.//' + OOo_Tag ('style', 'style', m)) :
|  ...     if n.get (OOo_Tag ('style', 'name', m)).startswith ('Co') :
|  ...         attrs = 'name', 'class', 'family'
|  ...         attrs = [n.get (OOo_Tag ('style', i, m)) for i in attrs]
|  ...         print attrs
|  ...         props = n.find ('./' + OOo_Tag ('style', 'properties', m))
|  ...         if props is not None and len (props) :
|  ...             props [0].tag
|  ['Concat_Standard', 'text', 'paragraph']
|  '{http://openoffice.org/2000/style}tab-stops'
|  ['Concat_Text body', 'text', 'paragraph']
|  ['Concat_List', 'list', 'paragraph']
|  ['Concat_Caption', 'extra', 'paragraph']
|  ['Concat_Frame contents', 'extra', 'paragraph']
|  ['Concat_Index', 'index', 'paragraph']
|  ['Concat_Frame', None, 'graphics']
|  ['Concat_OLE', None, 'graphics']
|  >>> for n in c.findall ('.//*') :
|  ...     zidx = n.get (OOo_Tag ('draw', 'z-index', m))
|  ...     if zidx :
|  ...         print ':'.join(split_tag (n.tag)), zidx
|  draw:text-box 0
|  draw:rect 1
|  draw:text-box 3
|  draw:rect 4
|  draw:text-box 6
|  draw:text-box 7
|  draw:text-box 8
|  draw:text-box 9
|  draw:text-box 10
|  draw:text-box 11
|  draw:text-box 12
|  draw:text-box 13
|  draw:text-box 14
|  draw:text-box 15
|  draw:text-box 16
|  draw:text-box 18
|  draw:text-box 19
|  draw:text-box 20
|  draw:text-box 17
|  draw:text-box 23
|  draw:line 24
|  draw:text-box 2
|  draw:text-box 5
|  draw:line 22
|  draw:line 21
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'carta.stw', outfile = sio)
|  >>> t = Transformer (
|  ...     o.mimetype
|  ...   , get_meta (o.mimetype)
|  ...   , Transforms.Addpagebreak_Style ()
|  ...   , Transforms.Mailmerge
|  ...     ( iterator =
|  ...         ( dict
|  ...             ( Spett = "Spettabile"
|  ...             , contraente = "First person"
|  ...             , indirizzo = "street? 1"
|  ...             , tipo = "racc. A.C."
|  ...             , luogo = "Varese"
|  ...             , oggetto = "Saluti"
|  ...             )
|  ...         , dict
|  ...             ( Spett = "Egregio"
|  ...             , contraente = "Second Person"
|  ...             , indirizzo = "street? 2"
|  ...             , tipo = "Raccomandata"
|  ...             , luogo = "Gavirate"
|  ...             , oggetto = "Ossequi"
|  ...             )
|  ...         )
|  ...     )
|  ...   , renumber_all (o.mimetype)
|  ...   , set_meta (o.mimetype)
|  ...   , Transforms.Fix_OOo_Tag ()
|  ...   )
|  >>> t.transform(o)
|  >>> o.close()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("carta-out.stw", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', mimetype = m)
|  >>> for node in body.findall (vset) :
|  ...     name = node.get (OOo_Tag ('text', 'name', m))
|  ...     print name, ':', node.text
|  Spett : Spettabile
|  contraente : First person
|  indirizzo : street? 1
|  Spett : Egregio
|  contraente : Second Person
|  indirizzo : street? 2
|  tipo : racc. A.C.
|  luogo : Varese
|  oggetto : Saluti
|  tipo : Raccomandata
|  luogo : Gavirate
|  oggetto : Ossequi
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'test.odt', outfile = sio)
|  >>> t   = Transformer (
|  ...       o.mimetype
|  ...     , get_meta (o.mimetype)
|  ...     , Transforms.Addpagebreak_Style ()
|  ...     , Transforms.Mailmerge
|  ...       ( iterator =
|  ...         ( dict (firstname = 'Erika', lastname = 'Nobody')
|  ...         , dict (firstname = 'Eric',  lastname = 'Wizard')
|  ...         , cb
|  ...         )
|  ...       )
|  ...     , renumber_all (o.mimetype)
|  ...     , set_meta (o.mimetype)
|  ...     , Transforms.Fix_OOo_Tag ()
|  ...     )
|  >>> t.transform (o)
|  >>> for i in meta_counts :
|  ...     print i, t [':'.join (('Set_Attribute', i))]
|  character-count 951
|  image-count 0
|  object-count 0
|  page-count 3
|  paragraph-count 53
|  table-count 3
|  word-count 162
|  >>> name = t ['Addpagebreak_Style:stylename']
|  >>> name
|  'P2'
|  >>> o.close ()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("testout.odt", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', m))
|  >>> for n in body.findall ('.//*') :
|  ...     zidx = n.get (OOo_Tag ('draw', 'z-index', m))
|  ...     if zidx :
|  ...         print ':'.join(split_tag (n.tag)), zidx
|  draw:frame 0
|  draw:rect 1
|  draw:frame 3
|  draw:rect 4
|  draw:frame 6
|  draw:rect 7
|  draw:frame 2
|  draw:frame 5
|  draw:frame 8
|  >>> for n in body.findall ('.//' + OOo_Tag ('text', 'p', m)) :
|  ...     if n.get (OOo_Tag ('text', 'style-name', m)) == name :
|  ...         print n.tag
|  {urn:oasis:names:tc:opendocument:xmlns:text:1.0}p
|  {urn:oasis:names:tc:opendocument:xmlns:text:1.0}p
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', m)
|  >>> for n in body.findall (vset) :
|  ...     if n.get (OOo_Tag ('text', 'name', m), None).endswith ('name') :
|  ...         name = n.get (OOo_Tag ('text', 'name', m))
|  ...         print name, ':', n.text
|  firstname : Erika
|  lastname : Nobody
|  firstname : Eric
|  lastname : Wizard
|  firstname : Hugo
|  lastname : Testman
|  firstname : Erika
|  lastname : Nobody
|  firstname : Eric
|  lastname : Wizard
|  firstname : Hugo
|  lastname : Testman
|  >>> for n in body.findall ('.//' + OOo_Tag ('draw', 'frame', m)) :
|  ...     print n.get (OOo_Tag ('draw', 'name', m)),
|  ...     print n.get (OOo_Tag ('text', 'anchor-page-number', m))
|  Frame1 1
|  Frame2 2
|  Frame3 3
|  Frame4 None
|  Frame5 None
|  Frame6 None
|  >>> for n in body.findall ('.//' + OOo_Tag ('text', 'section', m)) :
|  ...     print n.get (OOo_Tag ('text', 'name', m))
|  Section1
|  Section2
|  Section3
|  Section4
|  Section5
|  Section6
|  Section7
|  Section8
|  Section9
|  Section10
|  Section11
|  Section12
|  Section13
|  Section14
|  Section15
|  Section16
|  Section17
|  Section18
|  >>> for n in body.findall ('.//' + OOo_Tag ('table', 'table', m)) :
|  ...     print n.get (OOo_Tag ('table', 'name', m))
|  Table1
|  Table2
|  Table3
|  >>> r = o.read ('meta.xml')
|  >>> meta = r.find ('.//' + OOo_Tag ('meta', 'document-statistic', m))
|  >>> for i in meta_counts :
|  ...     print i, repr (meta.get (OOo_Tag ('meta', i, m)))
|  character-count '951'
|  image-count '0'
|  object-count '0'
|  page-count '3'
|  paragraph-count '53'
|  table-count '3'
|  word-count '162'
|  >>> o.close ()
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'carta.odt', outfile = sio)
|  >>> t = Transformer (
|  ...     o.mimetype
|  ...   , get_meta (o.mimetype)
|  ...   , Transforms.Addpagebreak_Style ()
|  ...   , Transforms.Mailmerge
|  ...     ( iterator =
|  ...         ( dict
|  ...             ( Spett = "Spettabile"
|  ...             , contraente = "First person"
|  ...             , indirizzo = "street? 1"
|  ...             , tipo = "racc. A.C."
|  ...             , luogo = "Varese"
|  ...             , oggetto = "Saluti"
|  ...             )
|  ...         , dict
|  ...             ( Spett = "Egregio"
|  ...             , contraente = "Second Person"
|  ...             , indirizzo = "street? 2"
|  ...             , tipo = "Raccomandata"
|  ...             , luogo = "Gavirate"
|  ...             , oggetto = "Ossequi"
|  ...             )
|  ...         )
|  ...     )
|  ...   , renumber_all (o.mimetype)
|  ...   , set_meta (o.mimetype)
|  ...   , Transforms.Fix_OOo_Tag ()
|  ...   )
|  >>> t.transform(o)
|  >>> o.close()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("carta-out.odt", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', mimetype = m)
|  >>> for node in body.findall (vset) :
|  ...     name = node.get (OOo_Tag ('text', 'name', m))
|  ...     print name, ':', node.text
|  Spett : Spettabile
|  contraente : First person
|  indirizzo : street? 1
|  Spett : Egregio
|  contraente : Second Person
|  indirizzo : street? 2
|  tipo : racc. A.C.
|  luogo : Varese
|  oggetto : Saluti
|  tipo : Raccomandata
|  luogo : Gavirate
|  oggetto : Ossequi
|  >>> sio = StringIO ()
|  >>> o   = OOoPy (infile = 'test.odt', outfile = sio)
|  >>> t   = Transformer (
|  ...       o.mimetype
|  ...     , get_meta (o.mimetype)
|  ...     , Transforms.Concatenate ('test.odt', 'rechng.odt')
|  ...     , renumber_all (o.mimetype)
|  ...     , set_meta (o.mimetype)
|  ...     , Transforms.Fix_OOo_Tag ()
|  ...     )
|  >>> t.transform (o)
|  >>> for i in meta_counts :
|  ...     print i, repr (t [':'.join (('Set_Attribute', i))])
|  character-count '1131'
|  image-count '0'
|  object-count '0'
|  page-count '3'
|  paragraph-count '80'
|  table-count '2'
|  word-count '159'
|  >>> o.close ()
|  >>> ov  = sio.getvalue ()
|  >>> f   = open ("testout3.odt", "wb")
|  >>> f.write (ov)
|  >>> f.close ()
|  >>> o = OOoPy (infile = sio)
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> s = o.read ('styles.xml')
|  >>> for n in c.findall ('./*/*') :
|  ...     name = n.get (OOo_Tag ('style', 'name', m))
|  ...     if name :
|  ...         parent = n.get (OOo_Tag ('style', 'parent-style-name', m))
|  ...         print '"%s", "%s"' % (name, parent)
|  "Tahoma1", "None"
|  "Bitstream Vera Sans", "None"
|  "Tahoma", "None"
|  "Nimbus Roman No9 L", "None"
|  "Courier New", "None"
|  "Arial Black", "None"
|  "New Century Schoolbook", "None"
|  "Times New Roman", "None"
|  "Arial", "None"
|  "Helvetica", "None"
|  "Table1", "None"
|  "Table1.A", "None"
|  "Table1.A1", "None"
|  "Table1.E1", "None"
|  "Table1.A2", "None"
|  "Table1.E2", "None"
|  "P1", "None"
|  "fr1", "Frame"
|  "fr2", "Frame"
|  "Sect1", "None"
|  "gr1", "None"
|  "P2", "Standard"
|  "Standard_Concat", "None"
|  "Concat_P1", "Concat_Frame_20_contents"
|  "Concat_P2", "Concat_Frame_20_contents"
|  "P3", "Concat_Frame_20_contents"
|  "P4", "Concat_Standard"
|  "P5", "Concat_Standard"
|  "P6", "Concat_Frame_20_contents"
|  "P7", "Concat_Frame_20_contents"
|  "P8", "Concat_Frame_20_contents"
|  "P9", "Concat_Frame_20_contents"
|  "P10", "Concat_Frame_20_contents"
|  "P11", "Concat_Frame_20_contents"
|  "P12", "Concat_Frame_20_contents"
|  "P14", "Concat_Standard"
|  "P15", "Concat_Standard"
|  "P16", "Concat_Standard"
|  "P17", "Concat_Standard"
|  "P18", "Concat_Standard"
|  "P19", "Concat_Standard"
|  "P20", "Concat_Standard"
|  "P21", "Concat_Standard"
|  "P22", "Concat_Standard"
|  "P23", "Concat_Standard"
|  "Concat_fr1", "Frame"
|  "Concat_fr2", "Frame"
|  "fr3", "Frame"
|  "fr4", "Frame"
|  "fr5", "Frame"
|  "fr6", "Frame"
|  "Concat_gr1", "None"
|  "N0", "None"
|  "N2", "None"
|  "P14_Concat", "Concat_Standard"
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'variable-decl', m)) :
|  ...     name = n.get (OOo_Tag ('text', 'name', m))
|  ...     print name
|  salutation
|  firstname
|  lastname
|  street
|  country
|  postalcode
|  city
|  date
|  invoice.invoice_no
|  invoice.abo.aboprice.abotype.description
|  address.salutation
|  address.title
|  address.firstname
|  address.lastname
|  address.function
|  address.street
|  address.country
|  address.postalcode
|  address.city
|  invoice.subscriber.salutation
|  invoice.subscriber.title
|  invoice.subscriber.firstname
|  invoice.subscriber.lastname
|  invoice.subscriber.function
|  invoice.subscriber.street
|  invoice.subscriber.country
|  invoice.subscriber.postalcode
|  invoice.subscriber.city
|  invoice.period_start
|  invoice.period_end
|  invoice.currency.name
|  invoice.amount
|  invoice.subscriber.initial
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'sequence-decl', m)) :
|  ...     name = n.get (OOo_Tag ('text', 'name', m))
|  ...     print name
|  Illustration
|  Table
|  Text
|  Drawing
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'p', m)) :
|  ...     name = n.get (OOo_Tag ('text', 'style-name', m))
|  ...     if not name or name.startswith ('Concat') :
|  ...         print ':'.join(split_tag (n.tag)), ">%s<" % name
|  text:p >None<
|  text:p >None<
|  text:p >Concat_P1<
|  text:p >Concat_P1<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_P2<
|  text:p >Concat_Frame_20_contents<
|  text:p >None<
|  text:p >None<
|  text:p >None<
|  >>> for n in c.findall ('.//' + OOo_Tag ('draw', 'frame', m)) :
|  ...     attrs = 'name', 'style-name', 'z-index'
|  ...     attrs = [n.get (OOo_Tag ('draw', i, m)) for i in attrs]
|  ...     attrs.append (n.get (OOo_Tag ('text', 'anchor-page-number', m)))
|  ...     print attrs
|  ['Frame1', 'fr1', '0', '1']
|  ['Frame2', 'fr1', '3', '2']
|  ['Frame3', 'Concat_fr1', '6', '3']
|  ['Frame4', 'Concat_fr2', '7', '3']
|  ['Frame5', 'fr3', '8', '3']
|  ['Frame6', 'Concat_fr1', '9', '3']
|  ['Frame7', 'fr4', '10', '3']
|  ['Frame8', 'fr4', '11', '3']
|  ['Frame9', 'fr4', '12', '3']
|  ['Frame10', 'fr4', '13', '3']
|  ['Frame11', 'fr4', '14', '3']
|  ['Frame12', 'fr4', '15', '3']
|  ['Frame13', 'fr5', '16', '3']
|  ['Frame14', 'fr4', '18', '3']
|  ['Frame15', 'fr4', '19', '3']
|  ['Frame16', 'fr4', '20', '3']
|  ['Frame17', 'fr6', '17', '3']
|  ['Frame18', 'fr4', '23', '3']
|  ['Frame19', 'fr2', '2', None]
|  ['Frame20', 'fr2', '5', None]
|  >>> for n in c.findall ('.//' + OOo_Tag ('text', 'section', m)) :
|  ...     attrs = 'name', 'style-name'
|  ...     attrs = [n.get (OOo_Tag ('text', i, m)) for i in attrs]
|  ...     print attrs
|  ['Section1', 'Sect1']
|  ['Section2', 'Sect1']
|  ['Section3', 'Sect1']
|  ['Section4', 'Sect1']
|  ['Section5', 'Sect1']
|  ['Section6', 'Sect1']
|  ['Section7', 'Sect1']
|  ['Section8', 'Sect1']
|  ['Section9', 'Sect1']
|  ['Section10', 'Sect1']
|  ['Section11', 'Sect1']
|  ['Section12', 'Sect1']
|  ['Section13', 'Sect1']
|  ['Section14', 'Sect1']
|  ['Section15', 'Sect1']
|  ['Section16', 'Sect1']
|  ['Section17', 'Sect1']
|  ['Section18', 'Sect1']
|  ['Section19', 'Sect1']
|  ['Section20', 'Sect1']
|  ['Section21', 'Sect1']
|  ['Section22', 'Sect1']
|  ['Section23', 'Sect1']
|  ['Section24', 'Sect1']
|  ['Section25', 'Sect1']
|  ['Section26', 'Sect1']
|  ['Section27', 'Sect1']
|  ['Section28', 'Sect1']
|  ['Section29', 'Sect1']
|  ['Section30', 'Sect1']
|  ['Section31', 'Sect1']
|  ['Section32', 'Sect1']
|  ['Section33', 'Sect1']
|  >>> for n in c.findall ('.//' + OOo_Tag ('draw', 'rect', m)) :
|  ...     attrs = 'style-name', 'text-style-name', 'z-index'
|  ...     attrs = [n.get (OOo_Tag ('draw', i, m)) for i in attrs]
|  ...     attrs.append (n.get (OOo_Tag ('text', 'anchor-page-number', m)))
|  ...     print attrs
|  ['gr1', 'P1', '1', '1']
|  ['gr1', 'P1', '4', '2']
|  >>> for n in c.findall ('.//' + OOo_Tag ('draw', 'line', m)) :
|  ...     attrs = 'style-name', 'text-style-name', 'z-index'
|  ...     attrs = [n.get (OOo_Tag ('draw', i, m)) for i in attrs]
|  ...     print attrs
|  ['Concat_gr1', 'P1', '24']
|  ['Concat_gr1', 'P1', '22']
|  ['Concat_gr1', 'P1', '21']
|  >>> for n in s.findall ('.//' + OOo_Tag ('style', 'style', m)) :
|  ...     if n.get (OOo_Tag ('style', 'name', m)).startswith ('Co') :
|  ...         attrs = 'name', 'display-name', 'class', 'family'
|  ...         attrs = [n.get (OOo_Tag ('style', i, m)) for i in attrs]
|  ...         print attrs
|  ...         props = n.find ('./' + OOo_Tag ('style', 'properties', m))
|  ...         if props is not None and len (props) :
|  ...             props [0].tag
|  ['Concat_Standard', None, 'text', 'paragraph']
|  ['Concat_Text_20_body', 'Concat Text body', 'text', 'paragraph']
|  ['Concat_List', None, 'list', 'paragraph']
|  ['Concat_Caption', None, 'extra', 'paragraph']
|  ['Concat_Frame_20_contents', 'Concat Frame contents', 'extra', 'paragraph']
|  ['Concat_Index', None, 'index', 'paragraph']
|  >>> for n in c.findall ('.//*') :
|  ...     zidx = n.get (OOo_Tag ('draw', 'z-index', m))
|  ...     if zidx :
|  ...         print ':'.join(split_tag (n.tag)), zidx
|  draw:frame 0
|  draw:rect 1
|  draw:frame 3
|  draw:rect 4
|  draw:frame 6
|  draw:frame 7
|  draw:frame 8
|  draw:frame 9
|  draw:frame 10
|  draw:frame 11
|  draw:frame 12
|  draw:frame 13
|  draw:frame 14
|  draw:frame 15
|  draw:frame 16
|  draw:frame 18
|  draw:frame 19
|  draw:frame 20
|  draw:frame 17
|  draw:frame 23
|  draw:line 24
|  draw:frame 2
|  draw:frame 5
|  draw:line 22
|  draw:line 21
|  >>> from os import system
|  >>> system ('python ./ooo_fieldreplace -i test.odt -o testout.odt '
|  ...         'salutation=Frau firstname=Erika lastname=Musterfrau '
|  ...         'country=D postalcode=00815 city=Niemandsdorf '
|  ...         'street="Beispielstrasse 42"')
|  0
|  >>> o = OOoPy (infile = 'testout.odt')
|  >>> c = o.read ('content.xml')
|  >>> m = o.mimetype
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', mimetype = m)
|  >>> for node in body.findall (vset) :
|  ...     name = node.get (OOo_Tag ('text', 'name', m))
|  ...     print name, ':', node.text
|  salutation : Frau
|  firstname : Erika
|  lastname : Musterfrau
|  street : Beispielstrasse 42
|  country : D
|  postalcode : 00815
|  city : Niemandsdorf
|  salutation : Frau
|  firstname : Erika
|  lastname : Musterfrau
|  street : Beispielstrasse 42
|  country : D
|  postalcode : 00815
|  city : Niemandsdorf
|  >>> o.close ()
|  >>> system ("./ooo_mailmerge -o testout.odt -d, carta.odt x.csv")
|  0
|  >>> o = OOoPy (infile = 'testout.odt')
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> vset = './/' + OOo_Tag ('text', 'variable-set', mimetype = m)
|  >>> for node in body.findall (vset) :
|  ...     name = node.get (OOo_Tag ('text', 'name', m))
|  ...     print name, ':', node.text
|  Spett : Spettabile
|  contraente : First person
|  indirizzo : street? 1
|  Spett : Egregio
|  contraente : Second Person
|  indirizzo : street? 2
|  tipo : racc. A.C.
|  luogo : Varese
|  oggetto : Saluti
|  tipo : Raccomandata
|  luogo : Gavirate
|  oggetto : Ossequi
|  >>> o.close ()
|  >>> o   = OOoPy (infile = 'testenum.odt', outfile = 'xyzzy.odt')
|  >>> t   = Transformer (
|  ...       o.mimetype
|  ...     , get_meta (o.mimetype)
|  ...     , Transforms.Addpagebreak_Style ()
|  ...     , Transforms.Mailmerge
|  ...       ( iterator =
|  ...         ( dict (firstname = 'Erika', lastname = 'Nobody')
|  ...         , dict (firstname = 'Eric',  lastname = 'Wizard')
|  ...         , cb
|  ...         )
|  ...       )
|  ...     , renumber_all (o.mimetype)
|  ...     , set_meta (o.mimetype)
|  ...     , Transforms.Fix_OOo_Tag ()
|  ...     )
|  >>> t.transform (o)
|  >>> o.close ()
|  >>> o = OOoPy (infile = 'xyzzy.odt')
|  >>> m = o.mimetype
|  >>> c = o.read ('content.xml')
|  >>> body = c.find (OOo_Tag ('office', 'body', mimetype = m))
|  >>> textlist = './/' + OOo_Tag ('text', 'list', m)
|  >>> for node in body.findall (textlist) :
|  ...     id = node.get (OOo_Tag ('xml', 'id', m))
|  ...     print 'xml:id', ':', id
|  xml:id : list1
|  xml:id : list2
|  xml:id : list3
|
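The doctest above opens an .odt file, parses its content.xml and searches for elements by namespace-qualified tag. As a rough illustration of the same idea without OOoPy, here is a minimal sketch using only the Python 3 standard library. It assumes an .odt file is a zip archive with a content.xml member; the helper names (tag, read_variables, make_sample_odt) and the sample data are hypothetical and are not part of OOoPy.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs of the OpenDocument format; the text one matches the
# "{urn:oasis:names:tc:opendocument:xmlns:text:1.0}p" tags printed above.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def tag(ns, name):
    """Build an ElementTree-style qualified name such as '{uri}p'."""
    return "{%s}%s" % (ns, name)


def read_variables(odt_file):
    """Return (name, text) pairs for each text:variable-set in content.xml."""
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    pairs = []
    for node in root.iter(tag(TEXT_NS, "variable-set")):
        pairs.append((node.get(tag(TEXT_NS, "name")), node.text))
    return pairs


def make_sample_odt():
    """Build a tiny in-memory stand-in for an .odt file (hypothetical data)."""
    content = (
        '<office:document-content xmlns:office="%s" xmlns:text="%s">'
        '<office:body><office:text><text:p>'
        '<text:variable-set text:name="firstname">Erika</text:variable-set> '
        '<text:variable-set text:name="lastname">Nobody</text:variable-set>'
        '</text:p></office:text></office:body>'
        '</office:document-content>' % (OFFICE_NS, TEXT_NS)
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("content.xml", content)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for name, text in read_variables(make_sample_odt()):
        print(name, ":", text)
```

To try it on a real document, pass the path of an .odt file to read_variables instead of the in-memory sample. The stand-in archive only carries content.xml, so a word processor would not open it.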
|  Method resolution order:
|      Transformer
|      ooopy.OOoPy.autosuper
|      __builtin__.object
|
|  Methods defined here:
|
|  __getitem__(self, key)
|
|  __init__(self, mimetype, *tf)
|
|  __setitem__(self, key, value)
|
|  insert(self, transform)
|      Insert a new transform
|
|  transform(self, ooopy)
|      Apply all the transforms in priority order.
|      Priority order is global over all transforms.
|
|  ----------------------------------------------------------------------
|  Data descriptors inherited from ooopy.OOoPy.autosuper:
|
|  __dict__
|      dictionary for instance variables (if defined)
|
|  __weakref__
|      list of weak references to the object (if defined)
|
|  ----------------------------------------------------------------------
|  Data and other attributes inherited from ooopy.OOoPy.autosuper:
|
|  __metaclass__ = <class 'ooopy.OOoPy._autosuper'>
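
The help text above says that transform() applies all transforms in priority order and that insert() adds a new transform. The sketch below illustrates that general idea in Python 3. It is not OOoPy's implementation: the Transform and Pipeline classes, the prio attribute, and the rule that lower numbers run first are all assumptions made for this example.

```python
class Transform:
    """Hypothetical transform: a name, a priority and a function to apply."""

    def __init__(self, name, prio, func):
        self.name = name
        self.prio = prio
        self.func = func


class Pipeline:
    """Hypothetical container that applies its transforms in priority order."""

    def __init__(self, *transforms):
        self.transforms = list(transforms)

    def insert(self, transform):
        """Register one more transform; ordering is decided at apply time."""
        self.transforms.append(transform)

    def transform(self, document):
        # sorted() is stable, so transforms with equal priority keep the
        # order in which they were registered.
        for t in sorted(self.transforms, key=lambda t: t.prio):
            document = t.func(document)
        return document


pipeline = Pipeline(
    Transform("add_suffix", 20, lambda s: s + "!"),
    Transform("upper", 10, lambda s: s.upper()),
)
pipeline.insert(Transform("strip", 5, lambda s: s.strip()))
result = pipeline.transform("  hello ")
print(result)
```

Sorting at apply time means the order in which transforms are passed in or inserted does not matter. Only the priority numbers decide what runs first.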

 

José Antonio Meira da Rocha

Journalist and professor of Publishing and Digital Media at the Universidade Federal de Santa Maria, Frederico Westphalen campus, Rio Grande do Sul, Brazil. Doctorate in Design from the Graduate Program in Design (PGDesign), Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil, 2023. Master's degree in Media from UNISINOS, São Leopoldo, RS, Brazil, 2003. Specialist in Informatics in Education, Unisinos, 1976.
