Revision 413addb3
Added by Sven Schöling about 8 years ago
SL/GDPDU.pm | ||
---|---|---|
594 | 594 |
|
595 | 595 |
=item location |
596 | 596 |
|
597 |
Location of the company, needed for the suupplier header
|
|
597 |
Location of the company, needed for the supplier header |
|
598 | 598 |
|
599 | 599 |
=item from |
600 | 600 |
|
... | ... | |
605 | 605 |
|
606 | 606 |
=item tables |
607 | 607 |
|
608 |
A list of tables to be exported.
|
|
608 |
Optional list of tables to be exported. Defaults to all tables.
|
|
609 | 609 |
|
610 | 610 |
=item all_tables |
611 | 611 |
|
612 |
Alternative to C<tables>, enables all known tables.
|
|
612 |
Optional alternative to C<tables>, forces all known tables.
|
|
613 | 613 |
|
614 | 614 |
=back |
615 | 615 |
|
616 | 616 |
=item C<generate_export> |
617 | 617 |
|
618 |
Do the work. Will return an absolut path to a temp file where all export files |
|
618 |
Do the work. Will return an absolute path to a temp file where all export files
|
|
619 | 619 |
are zipped together. |
620 | 620 |
|
621 | 621 |
=back |
622 | 622 |
|
623 | 623 |
=head1 CAVEATS |
624 | 624 |
|
625 |
Sigh. There are a lot of issues with the IDEA software that were found out by |
|
626 |
trial and error. |
|
627 |
|
|
628 |
=head2 Problems in the Specification |
|
629 |
|
|
625 | 630 |
=over 4 |
626 | 631 |
|
627 | 632 |
=item * |
628 | 633 |
|
629 |
Date format is shit. The official docs state that only C<YY>, C<YYYY>, C<MM>,
|
|
630 |
and C<DD> are supported, timestamps do not exist.
|
|
634 |
The specified date format supports only C<YY>, C<YYYY>, C<MM>,
|
|
635 |
and C<DD>. There are no timestamps or timezones.
|
|
631 | 636 |
|
632 | 637 |
=item * |
633 | 638 |
|
634 |
Number parsing seems to be fragile. Official docs state that behaviour for too
|
|
635 |
low C<Accuracy> settings is undefined. Accuracy of 0 is not taken to mean
|
|
636 |
Integer but instead generates a warning for redudancy.
|
|
639 |
Numbers have the same issue. There is no dedicated integer type, and hinting
|
|
640 |
at an integer type by setting accuracy to 0 generates a warning for redundant
|
|
641 |
accuracy.
|
|
637 | 642 |
|
638 |
There is no dedicated integer type. |
|
643 |
Also the number parsing is documented to be fragile. Official docs state that |
|
644 |
behaviour for too low C<Accuracy> settings is undefined. |
|
639 | 645 |
|
640 | 646 |
=item * |
641 | 647 |
|
642 |
Currently C<ar> and C<ap> have a foreign key to themself with the name |
|
643 |
C<storno_id>. If this foreign key is present in the C<INDEX.XML> then the |
|
644 |
storno records have to be too. Since this is extremely awkward to code and |
|
645 |
confusing for the examiner as to why there are records outside of the time |
|
646 |
range, this export skips all self-referential foreign keys. |
|
647 |
|
|
648 |
=item * |
|
648 |
Foreign key definition is broken. Instead of giving column maps it assumes that |
|
649 |
foreign keys map to the primary keys given for the target table, and in that |
|
650 |
order. Also the target table must be known in full before defining a foreign key. |
|
649 | 651 |
|
650 |
Documentation for foreign keys is extremely weird. Instead of giving column |
|
651 |
maps it assumes that foreign keys map to the primary keys given for the target |
|
652 |
table, and in that order. Foreign keys to keys that are not primary seems to be |
|
653 |
impossible. Changing type is also not allowed (which actually makes sense). |
|
654 |
Hopefully there are no bugs there. |
|
652 |
As a consequence any additional keys apart from primary keys are not possible. |
|
653 |
Self-referencing tables are also not possible. |
|
655 | 654 |
|
656 | 655 |
=item * |
657 | 656 |
|
658 |
It's currently disallowed to export the whole dataset. It's not clear if this |
|
659 |
is wanted. |
|
657 |
The spec does not support splitting data sets into smaller chunks. For data |
|
658 |
sets that exceed 700MB the spec helpfully suggests: "Use a bigger medium, such |
|
659 |
as a DVD". |
|
660 | 660 |
|
661 | 661 |
=item * |
662 | 662 |
|
663 |
It is not possible to set an empty C<DigiGroupingSymbol> since then the import |
|
663 |
It is not possible to set an empty C<DigitGroupingSymbol> since then the import
|
|
664 | 664 |
will just work with the default. This was asked in their forum, and the |
665 |
response actually was: |
|
665 |
response actually was to use a bogus grouping symbol that is not used:
|
|
666 | 666 |
|
667 | 667 |
Einfache Lösung: Definieren Sie das Tausendertrennzeichen als Komma, auch |
668 | 668 |
wenn es nicht verwendet wird. Sollten Sie das Komma bereits als Feldtrenner |
... | ... | |
681 | 681 |
|
682 | 682 |
Instead we just use the implicit default RecordDelimiter CRLF. |
683 | 683 |
|
684 |
=item *
|
|
684 |
=back
|
|
685 | 685 |
|
686 |
Not confirmed yet:
|
|
686 |
=head2 Bugs in the IDEA software
|
|
687 | 687 |
|
688 |
Foreign keys seem only to work with previously defined tables (which would be |
|
689 |
utterly insane). |
|
688 |
=over 4 |
|
690 | 689 |
|
691 | 690 |
=item * |
692 | 691 |
|
... | ... | |
699 | 698 |
Neither is it able to parse escaped C<ColumnDelimiter> in data. It just splits |
700 | 699 |
on that symbol no matter what surrounds or precedes it. |
701 | 700 |
|
701 |
=back |
|
702 |
|
|
703 |
=head2 Problems outside of the software |
|
704 |
|
|
705 |
=over 4 |
|
706 |
|
|
707 |
=item * |
|
708 |
|
|
709 |
The law states that "all business related data" should be made available. In |
|
710 |
practice there's no definition for what makes data "business related", and |
|
711 |
different auditors seem to want different data. |
|
712 |
|
|
713 |
Currently we export most of the transactional data, supplemented by |

714 |
customers, vendors and the chart of accounts. |
|
715 |
|
|
716 |
=item * |
|
717 |
|
|
718 |
While the standard explicitly states that data be provided normalized, in practice |

719 |
auditors aren't trained database operators and cannot create complex views on |

720 |
normalized data on their own. The reason this works for other software is that |
|
721 |
DATEV and SAP seem to have written import plugins for their internal formats in |
|
722 |
the IDEA software. |
|
723 |
|
|
724 |
So what is really exported is not unlike a DATEV export. Each transaction gets |
|
725 |
split into chunks of 2 positions (3 with tax on one side). Those get |

726 |
denormalized into a single data row with credit/debit/tax fields. The charts |
|
727 |
get denormalized into it as well, in addition to their account number serving |
|
728 |
as a foreign key. |
|
729 |
|
|
730 |
Customers and vendors get denormalized into this as well, but are linked by ids |
|
731 |
to their tables. And the reason for this is... |
|
732 |
|
|
702 | 733 |
=item * |
703 | 734 |
|
704 |
Fun fact: Some auditors do not have a full license of the IDEA software, and |
|
705 |
can't do table joins. So it's best to provide denormalized data for them, so |
|
706 |
that the auditor may infer which object is meant. |
|
735 |
Some auditors do not have a full license of the IDEA software, and |
|
736 |
can't do table joins. |
|
707 | 737 |
|
708 | 738 |
=back |
709 | 739 |
|
GDPDU: documentation update
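
The foreign key caveat in the diff above says that a target table must be known in full before a foreign key can point at it, and that self-referencing tables are not possible. One way to satisfy both constraints is to drop self-references and emit tables in dependency order. The sketch below is an assumption about how that could look, written in Python rather than the module's Perl; `export_order` and the example schema are hypothetical, and only `ar` and `storno_id` are names taken from the documentation. It needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter


def export_order(foreign_keys):
    """Return table names so that every referenced table comes first.

    foreign_keys maps table -> {column: referenced table}.
    Self-referential keys (such as a storno_id pointing back at the
    same table) are skipped, since the format cannot express them.
    """
    graph = {}
    for table, fks in foreign_keys.items():
        # Predecessors of a table are the tables it references.
        graph[table] = {target for target in fks.values() if target != table}
    # static_order() yields predecessors before the nodes depending on
    # them and raises graphlib.CycleError on circular references.
    return list(TopologicalSorter(graph).static_order())


# Hypothetical schema for illustration only.
schema = {
    "acc_trans": {"trans_id": "ar", "chart_id": "chart"},
    "ar": {"customer_id": "customer", "storno_id": "ar"},
    "customer": {},
    "chart": {},
}
order = export_order(schema)
```

With this schema, `customer` is placed before `ar`, and both `ar` and `chart` are placed before `acc_trans`; the relative order of unrelated tables is not significant.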