Repo for "TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets" at SDU@AAAI-22
Download and install a git client, then clone this repository into the <git-home> directory (the home directory is referred to as git-home from here on):
git clone git@github.com:DS3Lab/TableParser.git
- System overview of the TableParser pipeline (PDF figure: the TableParser pipeline).
- Model overview of Mask R-CNN in DocParser (PDF figure: Mask-RCNN).
- TableAnnotator: refer to this repo.
- Demo of annotating a table using TableAnnotator
- ExcelAnnotator: ./ExcelAnnotator
- TableParser pipelines: ./TableParser
- Data: Download from this Google Drive link.
- Models: TableParser M1 (ModernTableParser) and M2 (HistoricalTableParser) can be downloaded from this Google Drive link and should be placed under ./TableParser/TableParser/detectron2/tools/docparser_outputs
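The model placement step can be prepared from the command line. The following is a minimal sketch, not part of the original instructions: it assumes that `REPO_ROOT` (a hypothetical variable name) points at the root of the cloned repository, and it only creates the target directory and lists its contents. Downloading the models from Google Drive remains a manual step.

```shell
# Assumption: REPO_ROOT is the root of the cloned TableParser repository.
# It defaults to the current directory if unset.
REPO_ROOT="${REPO_ROOT:-.}"

# Target directory for the downloaded M1/M2 models, as named in this README.
MODEL_DIR="$REPO_ROOT/TableParser/TableParser/detectron2/tools/docparser_outputs"

# Create the directory if it does not exist yet (no-op otherwise).
mkdir -p "$MODEL_DIR"

# Report whether any model files have been placed there.
if [ -z "$(ls -A "$MODEL_DIR")" ]; then
  echo "No model files found in $MODEL_DIR yet"
else
  ls -A "$MODEL_DIR"
fi
```

After copying the downloaded models into that directory, rerunning the snippet should list them instead of printing the "No model files found" message.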
To cite TableParser, refer to these items:
@inproceedings{rausch2021docparser,
  title={DocParser: Hierarchical Document Structure Parsing from Renderings},
  author={Rausch, Johannes and Martinez, Octavio and Bissig, Fabian and Zhang, Ce and Feuerriegel, Stefan},
  booktitle={35th AAAI Conference on Artificial Intelligence (AAAI-21)(virtual)},
  howpublished = {\url{https://arxiv.org/abs/1911.01702}},
  year={2021}
}
@inproceedings{rao2022tableparser,
  title={TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets},
  author={Rao, Susie Xi and Rausch, Johannes and Egger, Peter and Zhang, Ce},
  booktitle={Scientific Document Understanding Workshop (SDU{@}AAAI-22)(virtual)},
  howpublished = {\url{https://arxiv.org/abs/2201.01654}},
  year={2022}
}