[diffoscope] [PATCH] Implement ElfFile as a container of sections

Dhole dhole at openmailbox.org
Fri Dec 25 17:43:14 CET 2015


On 12/24/2015 04:51 PM, Jérémy Bobbio wrote:
>> If there is any issue, comment or improvement let me know so that I can
>> address it :)
> 
> Yes, the code could be made nicer and more extensible quite easily.
> 
>> +_string_sections = ['.interp', '.dynstr', '.comment', '.debug_str']
>> +
>> +class ElfContainer(Container):
>> +    def __init__(self, *args, **kwargs):
>> +        super().__init__(*args, **kwargs)
>> +        logger.debug('Creating ElfContainer %s', self.source.path)
>> +        cmd = ['objdump', '--section-headers', self.source.path]
> 
> I think using `readelf --section-headers` would be better suited,
> because you get a TYPE column. Then you could have a dictionnary from
> TYPE to Python class. It would replace the `_string_sections` list and
> that if series:
> 
>> +            if 'CODE' in section_type:
>> +                self._sections[section_name] = ElfCodeSection
>> +                self._section_list.append(section_name)
>> +                logger.debug('Adding CODE section %s', section_name)
>> +            elif section_name in _string_sections:
>> +                self._sections[section_name] = ElfStringSection
>> +                self._section_list.append(section_name)
>> +                logger.debug('Adding String section %s', section_name)
>> +            elif 'CONTENTS' in section_type:
>> +                self._sections[section_name] = ElfSection
>> +                self._section_list.append(section_name)
>> +                logger.debug('Adding CONTENT section %s', section_name)
>> +            else:
>> +                # Some sections are empty. Eg.: .bss
>> +                pass
> 
> When you start cascading ifs like that, it's a good thing to always ask
> yourself: could I replace them by a dispatch table?

I agree that having a dispatch table would be much nicer. Unfortunately
(correct me if I'm missing something) I believe that the column type
from the `readelf --section-headers` output is not enough to know how to
deal with each section. For example, .text has code (to be disassembled)
.rodata has bytes (to be shown as a hex dump) and .comment has strings
(to be searched for strings to display), but they all show the same
type: PROGBITS:

>   [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
>   [14] .text             PROGBITS        0000000000408dc0 008dc0 070902 00  AX  0   0 16
>   [16] .rodata           PROGBITS        00000000004796d0 0796d0 022d18 00   A  0   0 16
>   [28] .comment          PROGBITS        0000000000000000 0b29a4 000086 01  MS  0   0  1

I'll see if I can find a better way to parse the output from `objdump
--section-headers`, or see if there is a way to get the proper type at once.

> Another thing: as soon as we start interpreting output like that,
> there's a chance we might be given things not exactly like we want them.
> Could you make sure that diffoscope will not crash (but rather fallback
> to binary diff for example) in case we can't parse the output of
> readelf?

Sure! I'll loot into that :)

-- 
Dhole

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: OpenPGP digital signature
URL: <http://lists.reproducible-builds.org/pipermail/diffoscope/attachments/20151225/d9bda227/attachment.sig>


More information about the diffoscope mailing list