
Author Topic: Floating point number representation  (Read 6951 times)

RetroHelix

  • Full Member
  • ***
  • Posts: 148
    • View Profile
Floating point number representation
« on: November 28, 2011, 03:07:46 am »
Sorry, this is not even romhacking related, but I don't know where else to ask.

I have a binary file here (from around 1990) that is part of an old piece of financial software. It just stores data such as name, price and so on for every customer who bought something. Almost everything is written in plain text and can be read easily, except the values for the prices.
I have two values that (I'm quite sure) represent the bill. In reality it goes as follows:
262.50
- 110.00
Sum 152.50

In the binary file it is:
Code: [Select]
0x00400389
0x00005C87

As you can see, there are only two hex values, since the sum can be calculated from them.

Do you have any idea how the prices are stored in these hex values? I tried the IEEE 754 standard, but it's no use. Any other ideas?

PS: I ask because I want to export the old customer data to a newer system, and the old software will not run on current PCs (not even in DOSBox).
Sorry again but I really don't know another place to ask such questions.

Jorpho

  • Hero Member
  • *****
  • Posts: 5050
  • The cat screams with the voice of a man.
    • View Profile
Re: Floating point number representation
« Reply #1 on: November 28, 2011, 09:05:50 am »
Quote:
PS: I ask because I want to export the old customer data to a newer system, and the old software will not run on current PCs (not even in DOSBox).
Have you tried Bochs?  PCem might be worth a try too.  And I think people might still occasionally run DOS under VMware or even QEMU.  But I would try Bochs first; it is much, much more accurate than DOSBox.  (Also, did you try using an actual disk image in DOSBox?  Certain emulation functions are not activated when directories are mounted.)

Anyway, my first instinct was to see if Python could do something like this, and the first thing that turned up was http://www.python-forum.org/pythonforum/viewtopic.php?f=1&t=13638 .  Is that how you followed the IEEE 754 standard?
This signature is an illusion and is a trap devised by Satan. Go ahead dauntlessly! Make rapid progres!

RetroHelix

Re: Floating point number representation
« Reply #2 on: November 28, 2011, 09:42:15 am »
I tried the Python script from your link and several online converters, but the output is not the same as in the binary file.
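
For comparison, here is a minimal sketch of what plain IEEE 754 single precision gives for the two amounts from the first post. It assumes the platform's `float` is a 32-bit IEEE 754 type (checked at compile time), and the helper name `ieee754_bits` is made up for this illustration. Neither pattern matches the bytes in the file, which fits the mismatch described above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: returns the raw bit pattern of a float.
std::uint32_t ieee754_bits(float value) {
    static_assert(std::numeric_limits<float>::is_iec559 &&
                      sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit IEEE 754 float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined way to reinterpret the bytes
    return bits;
}

// The two amounts from the first post, encoded as IEEE 754 singles.
void compare_with_file_values() {
    assert(ieee754_bits(262.50f) == 0x43834000u);  // the file shows 0x00400389
    assert(ieee754_bits(110.00f) == 0x42DC0000u);  // the file shows 0x00005C87
}
```

The exponent and mantissa fields are clearly laid out differently in the file, so whatever the software uses, it is not a straight IEEE 754 single.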

Bochs sounds good. It looks a bit hard to configure, but I'll give it a try. I have an image of the HDD that was made with Acronis True Image, but I'll try to make an image with WinImage since Bochs supports this. I'll try to use DOSBox with an image too.

Thanks so far.

KaioShin

  • RHDN Patreon Supporter!
  • Hero Member
  • *****
  • Posts: 5699
    • View Profile
    • The Romhacking Aerie
Re: Floating point number representation
« Reply #3 on: November 28, 2011, 01:25:08 pm »
Are you sure these are the hex values that data corresponds to? Depending on the exponent used, it can get quite complicated to figure out the binary representation of a number. Can you reverse engineer the software used to store those numbers perhaps?
All my posts are merely personal opinions and not statements of fact, even if they are not explicitly prefixed by "In my opinion", "IMO", "I believe", or similar modifiers. By reading this disclaimer you agree to reply in spirit of these conditions.

RetroHelix

Re: Floating point number representation
« Reply #4 on: November 28, 2011, 02:40:00 pm »
Yes, I'm quite sure these hex values correspond to the data I posted. I found out by elimination: this is what is left, and it's the only file big enough to hold all the data that needed to be saved. Other data is stored as plain text in this file, even numbers. The binary file is a kind of database that holds every disposal that was made.

The software itself runs under DOS (on an AMBRA 486 machine) and is just a bunch of menus (4 or 5 programs responsible for starting a "real" program...) with over 170 executables. As for reversing this, I don't think I'm experienced enough to reverse such things with a disassembler, apart from not knowing which file I would have to work on anyway. I can't run a file monitor on the old PC either, so it's a bit complicated.

At first I thought there was a standard way to store floating-point numbers back in those days, but it doesn't look like it.

Ryusui

  • Hero Member
  • *****
  • Posts: 4989
  • It's the greatest day.
    • View Profile
    • Tumblr
Re: Floating point number representation
« Reply #5 on: November 28, 2011, 02:49:31 pm »
Change the data in the program and see what changes in the file.
In the event of a firestorm, the salad bar will remain open.

RetroHelix

Re: Floating point number representation
« Reply #6 on: November 28, 2011, 03:06:36 pm »
Quote:
Change the data in the program and see what changes in the file.
I already thought about that, BUT I don't have the old PC at my place and it's still in use every day. That's a good idea though; I'll do that when I have the chance.

KaioShin

Re: Floating point number representation
« Reply #7 on: November 28, 2011, 03:16:44 pm »
Check out the representation of some whole numbers like 1, 2, 8, 10, 16, 100. Make sure to cover many possible base numbers so we can figure out which bits correspond to the exponent. Then take one of those numbers and add different decimal places to it so we get an idea of how the fraction is calculated. Keep in mind that the precision of floating-point numbers varies with the size of the number, so some numbers will look slightly off. The bigger the sample pool, the easier it is to figure out.

RedComet

  • Hero Member
  • *****
  • Posts: 3171
    • View Profile
    • Twilight Translations
Re: Floating point number representation
« Reply #8 on: November 28, 2011, 09:05:12 pm »
Can you disassemble the executable or otherwise look at the code that uses that data? If it isn't using IEEE 754, it's likely that the floating-point arithmetic is done in software (assuming that this isn't a program that runs on custom hardware). Doing floating-point arithmetic with only integer operations is annoying, but not too hard to figure out.
Twilight Translations - More than just Dragonball Z. :P

Kiyoshi Aman

  • RHDN Patreon Supporter!
  • Hero Member
  • *****
  • Posts: 2262
  • Browncoat Captain
    • View Profile
    • Aerdan's Blog
Re: Floating point number representation
« Reply #9 on: November 28, 2011, 10:47:53 pm »
I doubt they're floating-point, unless the authors of the software are really incompetent; you don't use floating-point numbers, ever, for financial software. (Binary floating-point numbers tend to produce rounding errors because they can't represent most decimal fractions exactly.)

I would suspect that there's a ratio you need to divide by in order to get the values you see in the program.
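
To illustrate both points, here is a small sketch. It assumes IEEE 754 `float` and `double`, and all names in it (`float_storage_error`, `Money`, `subtract`) are made up for the example. An amount like 46.70 picks up an error as soon as it is stored in a 32-bit float, while an integer count of cents, divided by a ratio of 100 only for display, stays exact.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// How far a 32-bit float lands from the decimal amount it was meant to hold.
double float_storage_error(double amount) {
    const float stored = static_cast<float>(amount);
    return static_cast<double>(stored) - amount;
}

// Fixed-point alternative: keep an integer number of cents and divide
// by the ratio (100) only when displaying the value.
struct Money {
    std::int64_t cents;
};

Money subtract(Money a, Money b) { return Money{a.cents - b.cents}; }

void rounding_demo() {
    assert(float_storage_error(262.50) == 0.0);            // happens to be exact
    assert(std::fabs(float_storage_error(46.70)) > 1e-7);  // not representable
    assert(subtract(Money{26250}, Money{11000}).cents == 15250);  // 152.50
}
```

Amounts such as 262.50 or 110.00 happen to be exact in binary, which is why a floating-point format can look harmless until a value like 46.70 turns up.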

Jorpho

Re: Floating point number representation
« Reply #10 on: November 29, 2011, 12:30:33 am »
Quote:
Bochs sounds good. It looks a bit hard to configure, but I'll give it a try. I have an image of the HDD that was made with Acronis True Image, but I'll try to make an image with WinImage since Bochs supports this. I'll try to use DOSBox with an image too.
It's been a long time since I toyed with Bochs; the FreeDOS image at http://bochs.sourceforge.net/diskimages.html used to be good for a starting point.  I'm pretty sure it comes with a "bximage" utility that can be used to create HDD images from scratch, though.

If you don't want to use WinImage, "DiskExplorer" is a freeware alternative with most of the same features.

Auryn

  • Hero Member
  • *****
  • Posts: 650
    • View Profile
Re: Floating point number representation
« Reply #11 on: November 29, 2011, 12:35:05 pm »
I didn't read the whole thread, but if you have the old program and the database, why not try to set up a virtual PC and run the program? Maybe it has an export module.

RetroHelix

Re: Floating point number representation
« Reply #12 on: December 08, 2011, 04:51:18 pm »
Thanks for the help so far.

I tried Bochs and got an error about a missing DLL when trying to run the software, although every file was in place.
I tried DOSBox and got the error "This is not a valid -name-of-software-here- application" for most of the program's executable files. After running almost all the EXEs in DOSBox (:banghead:) I found one that lets me access the customer data. I can't change the values in the menu, but at least I can display a specific customer and his last purchase.

Quote:
Change the data in the program and see what changes in the file.
I went the other way around and changed some values inside the binary file. This is what I got:
Spoiler:
Code: [Select]
hex dec binary

01 0,00
02 0,00
05 0,00
FF junk (random text written over the boundaries of the textbox)
99 junk (random text written over the boundaries of the textbox)
10 0,00
0a 0,00
63 0,00
64 junk (random text written over the boundaries of the textbox)
66 junk (random text written over the boundaries of the textbox)
24 0,00
0100 0,00
0101 0,00
0501 0,00
FF01 0,00
FF30 0,00
1030 0,00
1E10 0,00
1E86 39,50
1E85 19,75

0170 0,00 101110000
0179 0,00 101111001
017A 0,01 101111010
017B 0,02 101111011
017D 0,06 101111101
017E 0,13 101111110
017F 0,25 101111111
0180 0,50 110000000
0181 1,01 110000001
0182 2,02 110000010
0183 4,03 110000011
0184 8,06 110000100
0A84 8,63
0B84 8,69
0C84 8,75
0D84 8,81
0E84 8,88
1E84 9,88
2E84 10,88
3E84 11,88
4E84 12,88
5E84 13,88
6E84 14,88
7E84 15,88
8E79 -0,00 1000111001111001
8E7A -0,01 1000111001111010
8E84 -8,88 1000111010000100
9E84 -9,88
AE84 -10,88
3E84 11,88
3F84 11,94

82 2,00
83 4,00
84 8,00
85 16,00
86 32,00
87 64,00

Each value is still 4 bytes in size; I just tried low values first!
Looking at the binary representation of the values, there seems to be a pattern. But at this point I realized that I'm not good with numbers ^^ I'll investigate this further tomorrow; I just want to leave this here in case somebody wants to take a look.

Quote:
I doubt they're floating-point, unless the authors of the software are really incompetent; you don't use floating-point numbers, ever, for financial software. (Floating-point numbers tend to result in rounding errors because they can't precisely represent decimal values.)

I would suspect that there's a ratio you need to divide by in order to get the values you see in the program.
Mmh, good point. I tried multiplying/dividing by 100 to get rid of the decimal point or to re-add it, but it was no use.

EDIT:
I think I'm quite close to finding out how it works :D

By toying around with every byte of the value displayed in the program, I found out that an exponent is used to calculate the actual floating-point number (see the end of the spoiler). Since all I had understood about IEEE 754 was that it uses a power-of-two exponent and one bit for the sign, I had another look at it. I still don't get what a mantissa is though ^^. While investigating the IEEE 754 standard I converted some values from the program and noticed that three bytes are identical to the program's hexadecimal representation, only in reverse order.
Example: 46,70 DM (or any other currency, it doesn't matter) is:
Financing Software: CD CC 3A 86
IEEE-Standard: 42 3A CC CD

From this, and from changing some values and looking at their binary representation, I found out that the sign flag is stored somewhere other than the first bit (it is the 16th bit, counting from right to left). So of the two bytes that differ (86 and 42), the 86 does not hold the sign bit, but it is still off. How do I get a 42 out of an 86? Just shifting it one bit to the right does not work, but it comes close to the right value (86 shifted right = 43).

Any ideas?

EDIT 2:
Fiddled around a bit more. It almost looks like they calculate the exponent from the one differing byte by subtracting 129 (decimal) from it. I'll try to verify this tomorrow.
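
One way to get from 86 to 42, sketched under the assumptions made in this post (exponent in the last byte with a bias of 129, sign in the top bit of the third byte, mantissa bytes in reverse order). IEEE 754 uses a bias of 127, so the exponent drops by 2 (0x86 becomes 0x84), and it then sits one bit further right because the sign bit takes the top position (0x84 >> 1 = 0x42). The function name `to_ieee754_bits` is made up, and the treatment of very small exponent bytes is a guess.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Repack four bytes, given in file order, into an IEEE 754 single bit pattern.
// Assumed layout (derived from the experiments above, not from a specification):
//   bytes[0..2] = mantissa, least significant byte first
//   top bit of bytes[2] = sign
//   bytes[3] = exponent with a bias of 129
std::uint32_t to_ieee754_bits(const std::array<std::uint8_t, 4>& bytes) {
    const std::uint32_t exponent = bytes[3];
    // Exponent bytes below 3 cannot be rebiased to a normal IEEE 754
    // exponent; this sketch simply maps them to zero.
    if (exponent < 3) return 0;
    const std::uint32_t sign = (bytes[2] >> 7) & 1u;
    const std::uint32_t mantissa =
        (static_cast<std::uint32_t>(bytes[2] & 0x7Fu) << 16) |
        (static_cast<std::uint32_t>(bytes[1]) << 8) |
        static_cast<std::uint32_t>(bytes[0]);
    // Bias 129 -> bias 127 means subtracting 2 from the exponent byte.
    return (sign << 31) | ((exponent - 2u) << 23) | mantissa;
}

void check_example_from_post() {
    // 46,70 is stored as CD CC 3A 86; IEEE 754 gives 42 3A CC CD.
    const std::array<std::uint8_t, 4> stored{{0xCD, 0xCC, 0x3A, 0x86}};
    assert(to_ieee754_bits(stored) == 0x423ACCCDu);
}
```

The same repacking also reproduces the negative test value from the spoiler: the bytes ending in 8E 84 come out as 0xC10E0000, which is -8.875 as an IEEE 754 single.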
« Last Edit: December 09, 2011, 05:48:42 pm by RetroHelix »

Zande

  • Jr. Member
  • **
  • Posts: 15
    • View Profile
Re: Floating point number representation
« Reply #13 on: December 12, 2011, 07:36:26 am »
It seems they are stored as the old floating-point type used in Pascal, although that Pascal type, called Real48, uses 48 bits for its software-implemented floating-point format...

262.50 in the Pascal Real48 type would be 0x034000000089 (stored in your file as 0x00400389). The least significant byte (0x89 in this case) is the exponent, and it uses a bias of 129 (unlike IEEE 754, which uses a bias of 127 for 8-bit exponents). The most significant bit is the sign bit, and the remaining 39 bits make up the mantissa. (Just for the record, IEEE 754 floats store the exponent and mantissa the other way around: the mantissa comes first, starting from the least significant bit.)
If the values in your file are stored in only 32 bits (4 bytes), it seems they've dropped 16 bits from the mantissa, which is possible but lowers the precision. It also seems that the three bytes making up the mantissa are stored in reverse order.

I don't know what languages you use/know, but I threw together a C++ function which should convert the value to a double, I think... Oo
Code: [Select]
#include <cmath>

// Value holds the four bytes as they appear in the file, e.g. 0x00400389.
double Convert(unsigned long Value) {
    // Flip back the byte order of the mantissa.
    Value = (Value & 0x00FF00FF) | ((Value & 0xFF000000) >> 16) | ((Value & 0x0000FF00) << 16);

    // Exponent bias and mantissa scale.
    const double Bias = 129.00;
    const double Scale = static_cast<double>(1 << /* Number of bits of the mantissa */ 23);

    // Separate the sign bit, exponent and mantissa.
    double Sign = (Value & 0x80000000) ? -1.00 : 1.00;
    double Exponent = static_cast<double>(Value & 0xFF);
    double Mantissa = static_cast<double>((Value >> 8) & 0x7FFFFF);

    // In Real48 an exponent byte of 0 means the value is zero
    // (assumed to hold for this shortened format as well).
    if (Exponent == 0.00) return 0.00;

    // Finally apply a bit of magic: the mantissa has an implicit leading 1.
    return Sign * std::pow(2.00, Exponent - Bias) * (1.00 + (Mantissa / Scale));
}

RetroHelix

Re: Floating point number representation
« Reply #14 on: December 12, 2011, 09:32:54 am »
Yeah, that's how it's done. Good job.  :thumbsup:
Your function works fine. I only use C# these days but I learned some C++ (and some basics of other languages) years ago.

When I wrote my last edit it was almost clear that the exponent comes from the last byte. Seeing the other bytes in reverse order, I calculated one value by hand and lost interest in investigating further since it worked. I didn't even try to write any code ^^
It's nice to see how elegantly you put the bytes back in order. I think I would have done it by copying every single byte into a new byte array.
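
The byte-by-byte variant could look roughly like the sketch below. It takes the four bytes in file order instead of a packed integer. The name `decode_price` is made up, and returning zero for an exponent byte of 0 is an assumption carried over from Real48 rather than something verified against this software.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Decode one 4-byte value, with the bytes given in file order:
// mantissa low, mantissa middle, sign + mantissa high, exponent (bias 129).
double decode_price(const std::array<std::uint8_t, 4>& bytes) {
    const int exponent = bytes[3];
    if (exponent == 0) return 0.0;  // assumption: exponent 0 encodes zero
    const bool negative = (bytes[2] & 0x80u) != 0;
    const std::uint32_t fraction =
        (static_cast<std::uint32_t>(bytes[2] & 0x7Fu) << 16) |
        (static_cast<std::uint32_t>(bytes[1]) << 8) |
        static_cast<std::uint32_t>(bytes[0]);
    // Implicit leading 1 followed by 23 fraction bits (2^23 = 8388608).
    const double significand = 1.0 + fraction / 8388608.0;
    const double magnitude = std::ldexp(significand, exponent - 129);
    return negative ? -magnitude : magnitude;
}

void check_bill_from_first_post() {
    const std::array<std::uint8_t, 4> first{{0x00, 0x40, 0x03, 0x89}};
    const std::array<std::uint8_t, 4> second{{0x00, 0x00, 0x5C, 0x87}};
    assert(decode_price(first) == 262.50);
    assert(decode_price(second) == 110.00);
}
```

Values such as 46,70 come out as roughly 46.7000008, so an exporter would probably want to round to two decimal places before writing the amount anywhere.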

Thank you, now I can complete my tool. It's now possible to use a new PC with up-to-date software and, in case the old data is needed, my tool. Mmmh, maybe I'll have a look at the new software and write an importer for it :D