Revision 413addb3

Added by Sven Schöling about 8 years ago

  • ID 413addb38b69fde3dbac2e5861dc651dc8ae878c
  • Parent e1f93c18
  • Child 821d5e34

GDPDU: documentation update


SL/GDPDU.pm
594 594

  
595 595
=item location
596 596

  
597
Location of the company, needed for the suupplier header
597
Location of the company, needed for the supplier header
598 598

  
599 599
=item from
600 600

  
......
605 605

  
606 606
=item tables
607 607

  
608
A list of tables to be exported.
608
Optional list of tables to be exported. Defaults to all tables.
609 609

  
610 610
=item all_tables
611 611

  
612
Alternative to C<tables>, enables all known tables.
612
Optional alternative to C<tables>, forces all known tables.
613 613

  
614 614
=back
615 615

  
616 616
=item C<generate_export>
617 617

  
618
Do the work. Will return an absolut path to a temp file where all export files
618
Do the work. Will return an absolute path to a temp file where all export files
619 619
are zipped together.
620 620

  
621 621
=back
622 622

  
623 623
=head1 CAVEATS
624 624

  
625
Sigh. There are a lot of issues with the IDEA software that were found out by
626
trial and error.
627

  
628
=head2 Problems in the Specification
629

  
625 630
=over 4
626 631

  
627 632
=item *
628 633

  
629
Date format is shit. The official docs state that only C<YY>, C<YYYY>, C<MM>,
630
and C<DD> are supported, timestamps do not exist.
634
The specced date format is capable of only C<YY>, C<YYYY>, C<MM>,
635
and C<DD>. There are no timestamps or timezones.
631 636

  
632 637
=item *
633 638

  
634
Number parsing seems to be fragile. Official docs state that behaviour for too
635
low C<Accuracy> settings is undefined. Accuracy of 0 is not taken to mean
636
Integer but instead generates a warning for redudancy.
639
Numbers have the same issue. There is no dedicated integer type, and hinting
640
at an integer type by setting accuracy to 0 generates a warning for redundant
641
accuracy.
637 642

  
638
There is no dedicated integer type.
643
Also the number parsing is documented to be fragile. Official docs state that
644
behaviour for too low C<Accuracy> settings is undefined.
639 645

  
640 646
=item *
641 647

  
642
Currently C<ar> and C<ap> have a foreign key to themself with the name
643
C<storno_id>. If this foreign key is present in the C<INDEX.XML> then the
644
storno records have to be too. Since this is extremely awkward to code and
645
confusing for the examiner as to why there are records outside of the time
646
range, this export skips all self-referential foreign keys.
647

  
648
=item *
648
Foreign key definition is broken. Instead of giving column maps it assumes that
649
foreign keys map to the primary keys given for the target table, and in that
650
order. Also the target table must be known in full before defining a foreign key.
649 651

  
650
Documentation for foreign keys is extremely weird. Instead of giving column
651
maps it assumes that foreign keys map to the primary keys given for the target
652
table, and in that order. Foreign keys to keys that are not primary seems to be
653
impossible. Changing type is also not allowed (which actually makes sense).
654
Hopefully there are no bugs there.
652
As a consequence any additional keys apart from primary keys are not possible.
653
Self-referencing tables are also not possible.
655 654

  
656 655
=item *
657 656

  
658
It's currently disallowed to export the whole dataset. It's not clear if this
659
is wanted.
657
The spec does not support splitting data sets into smaller chunks. For data
658
sets that exceed 700MB the spec helpfully suggests: "Use a bigger medium, such
659
as a DVD".
660 660

  
661 661
=item *
662 662

  
663
It is not possible to set an empty C<DigiGroupingSymbol> since then the import
663
It is not possible to set an empty C<DigitGroupingSymbol> since then the import
664 664
will just work with the default. This was asked in their forum, and the
665
response actually was:
665
response actually was to use a bogus grouping symbol that is not used:
666 666

  
667 667
  Einfache Lösung: Definieren Sie das Tausendertrennzeichen als Komma, auch
668 668
  wenn es nicht verwendet wird. Sollten Sie das Komma bereits als Feldtrenner
......
681 681

  
682 682
Instead we just use the implicit default RecordDelimiter CRLF.
683 683

  
684
=item *
684
=back
685 685

  
686
Not confirmed yet:
686
=head2 Bugs in the IDEA software
687 687

  
688
Foreign keys seem only to work with previously defined tables (which would be
689
utterly insane).
688
=over 4
690 689

  
691 690
=item *
692 691

  
......
699 698
Neither is it able to parse escaped C<ColumnDelimiter> in data. It just splits
700 699
on that symbol no matter what surrounds or precedes it.
701 700

  
701
=back
702

  
703
=head2 Problems outside of the software
704

  
705
=over 4
706

  
707
=item *
708

  
709
The law states that "all business related data" should be made available. In
710
practice there's no definition for what makes data "business related", and
711
different auditors seem to want different data.
712

  
713
Currently we export most of the transactional data with supplementing
714
customers, vendors and chart of accounts.
715

  
716
=item *
717

  
718
While the standard explicitly states to provide normalized data, in practice
719
auditors aren't trained database operators and cannot create complex views on
720
normalized data on their own. The reason this works for other software is that
721
DATEV and SAP seem to have written import plugins for their internal formats in
722
the IDEA software.
723

  
724
So what is really exported is not unlike a DATEV export. Each transaction gets
725
split into chunks of 2 positions (3 with tax on one side). Those get
726
denormalized into a single data row with credit/debit/tax fields. The charts
727
get denormalized into it as well, in addition to their account number serving
728
as a foreign key.
729

  
730
Customers and vendors get denormalized into this as well, but are linked by ids
731
to their tables. And the reason for this is...
732

  
702 733
=item *
703 734

  
704
Fun fact: Some auditors do not have a full license of the IDEA software, and
705
can't do table joins. So it's best to provide denormalized data for them, so
706
that the auditor may infer which object is meant.
735
Some auditors do not have a full license of the IDEA software, and
736
can't do table joins.
707 737

  
708 738
=back
709 739

  

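Note on the C<ColumnDelimiter> caveat in the diff above: the POD says the import splits on the delimiter no matter what surrounds it. The following is a minimal Perl sketch of that behaviour and of one possible workaround, not the code in SL/GDPDU.pm. The delimiter value, the sample row, and the helpers `naive_split` and `sanitize_field` are assumptions made up for illustration.

```perl
use strict;
use warnings;

# Assumption: ';' stands in for whatever ColumnDelimiter the export declares.
my $column_delimiter = ';';

# Mimics the import behaviour described in the POD: every occurrence of
# the delimiter starts a new field, with no notion of escaping or quoting.
sub naive_split {
    my ($line) = @_;
    return split /\Q$column_delimiter\E/, $line, -1;
}

# Hypothetical helper: replace the delimiter inside a value with a space
# so that a row always splits into the expected number of fields.
sub sanitize_field {
    my ($value) = @_;
    $value =~ s/\Q$column_delimiter\E/ /g;
    return $value;
}

# Made-up sample row; the second field contains the delimiter.
my @row = ('1001', 'Miller; Sons', '42.00');

my @broken = naive_split(join $column_delimiter, @row);
my @clean  = naive_split(join $column_delimiter, map { sanitize_field($_) } @row);

printf "unsanitized: %d fields\n", scalar @broken;
printf "sanitized:   %d fields\n", scalar @clean;
```

Replacing the delimiter changes the exported value, so whether that is acceptable for audit data is a decision the sketch does not settle. Choosing a delimiter that cannot occur in the data would be another option.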