News:

11 March 2016 - Forum Rules

Main Menu

[Solved] Line Leveling Problem

Started by flame, October 30, 2015, 04:48:35 PM

Previous topic - Next topic

flame

Me and my group are working on Zero no Kiseki at http://heroesoflegend.org/

I want some pseudocode that will do line leveling. I guess it could apply to any game, though, not just this one, where text appears across multiple lines in the same dialog box.

Given a string that's English text (with spaces) and a number of lines to use, produce a set of strings equal to the number of lines to use, that follow these criteria:

1) Word ordering is preserved
2) The sum of the squares of the line lengths is a minimum. In the event multiple configurations produce least sum of squares, any one of them with the minimum sum of squares is produced. Spaces that were in the original string between line breaks don't count towards the line length for the sum of squares calculation.

So if the input is "The quick brown fox jumped over the lazy dog.", 3 lines, the program would produce:
"The quick brown" len = 15
"fox jumped over" len = 15
"the lazy dog." len = 13
sum of squares = 619, the least of any configuration of these words.

Any ideas on this problem?
Do you just try every combination and spit out the one with the least sum of squares?

Gideon Zhi

Pitfall, in case you haven't considered it: number of lines to use isn't always the same as the number of lines in the window. If "The quick brown fox jumps over the lazy dog" fits in two lines, but the window is three, then splitting it out over three lines will leave a lot of unnecessary empty space to the right of the text.

flame

So is that better, then? To leave space at the bottom, rather than space to the right?

Gideon Zhi

It depends on the language. English is read left-to-right, so it makes more sense visually to fill in to the right before filling in down. Japanese is top-to-bottom, right-to-left, so it makes sense to fill in vertically first.

tryphon

QuoteWhat
do
you
think ?

If you really want to keep your original idea and don't want to try every possibility, count the number of letters, divide by the number of line,  then try to cut lines near the result of the division. Try several solutions not too far and you should find the best choice among them, or not too far.

But then again, I think it's an awful idea.

henke37

You may want to read up on this "latex" system. I hear that it has some fancy linebreaking logic.

Gemini

Unless you have dialog in balloons, it's always preferred to make full use of a line and break whenever you can't fit more. This generally applies to left and right aligned text, centered would require something like what you mentioned, with evenly spaced lines.

tryphon

Quote from: henke37 on October 31, 2015, 06:39:57 AM
You may want to read up on this "latex" system. I hear that it has some fancy linebreaking logic.

Yes it has, but it's overkill. In particular, space between letters and words is variable and calculated in a unit that must be in the order of magnitude of the nanometer...

flame

I should have posted it initially. We're already far along in terms of hacking out the project (still in progress on the translation piece).
The game is Zero no Kiseki (PSP). The lead is also going to try for Zero no Kiseki (PC), because of similar everything between the versions.
Anyway, I made this demo video: https://www.youtube.com/watch?v=UmT5-0Oo9RE
You can get a feel for how the text works.

The most common type is the 5c dialogue you see at the beginning. It's always three lines in the vertical direction. The horizontal direction is determined by how long what they have to say is. It's just one opcode for multiple pages of text, the horizontal box distance is based on the longest string in the whole set.

The other type you can see is at 2 minutes in for NPCs. This one is as many or as few lines as needed, I think up to five or six, but that's not common. The horizontal distance is as before, based on the longest line in the set.

This one needs to be leveled out because its text is big enough for three lines (and the lines are unbalanced):

There's two solutions. There's enough space to fit everything on one line if we widen the box a bit - that just means putting extra text on the first line (and then the rest will fit). The other one is to level it out to least sum of squares. That will make the box smaller. There's quite a lot in this box, so I don't think it really matters. But the current config is sub-optimal for sure.

Next example. This is short enough to fit all on one line. Is that the right decision?


I would stay "stop by our website." But there isn't much to see there yet. If you would like, stop by http://www.heroesoflegend.org/forums/ and say hi, but there's not much to see there either. We mostly need translation help. I wanted a tool like this to help things go more smoothly.

tryphon

You mean you can adapt the frame width at will ? If so, yes it makes more sense.

Nonetheless, your least square criterium is not good. You'd better give a minimum line width, or a max number of lines.

Imagine "a b c d e f g h i j".

Do you prefer
a b c d e f g h i j
(sum of squares : 361)
or :
a
b
c
d
e
f
g
h
i
j

(sum of squares : 10) ?

Bisqwit

#10
Quote from: Gideon Zhi on October 30, 2015, 06:33:47 PMJapanese is top-to-bottom, right-to-left, so it makes sense to fill in vertically first.

Only in some books or in vertical signs. In computer context, it's written the same way as English.

Personally I've been using a simple greedy line wrapper algorithm where the last word that cannot fit entirely on the current line will be placed on a new line, so a minimal number of lines will be used, and the last line is as short as possible.

Surprisingly there is actually a whole field of research to this topic. Especially if you want to justify the text. Things like deliberately avoiding creating diagonal patterns in the spaces on adjacent lines, or recognizing hyphenation, or avoiding breaking certain phrases or proper names. Right now I can't think of the right keywords to find the articles I remember reading...

But here's a good starting point: https://en.wikipedia.org/wiki/Line_wrap_and_word_wrap

In particular, op seems to want to reinvent Knuth's algorithm (optimizing for minimum raggedness in the line lengths). Here is one page that explains this algorithm: http://defoe.sourceforge.net/folio/knuth-plass.html

flame

I wrote this code that does it (based on raggedness):
http://pastebin.com/LfcjmiTk

Thanks Bisqwit for the suggestion.
I didn't see any examples elsewhere for Python implementation of raggedness algorithm.
I did find this algorithm for C++ that does raggedness: http://stackoverflow.com/questions/9071205/balanced-word-wrap-minimum-raggedness-in-php

I used filter() and permutations and combinations to produce the list of combinations to try. I don't understand recursion very well.
Like (1, 1, 1, 1), (1, 1, 1, 2), etc.... Then I used filter() to remove the combinations that don't add up to the correct number of words. Then I used permutations to try (2, 1, 1, 1), (1, 2, 1, 1) etc... for each valid combination.

The only thing it doesn't do yet that I would like it to is choose between solutions where cost is tied. But I think that situation is quite rare. The current operation will produce randomly one of the solutions with the lowest cost, if there are more than one.

BlackDog61

#12
It's funny we're making things so complicated. I do understand the programming perspecive (since i'ver had word wrapping to handle with fixed dialog box size ;)).
Intuitively, I would have applied much simpler rules:
Define max dialog box width in characters (I'll assume 100 for this example.)
- If total number of characters is less than 30,use a single line box.
- Otherwise, try to use balanced lines lengths with a minimum number of lines of 2, always fitting text on the minimum number of lines possible (ex: 2 lines only for 198 character). When balancing words on 2 lines, always make the first one "just a bit" longer (if it cannot be equal).

The obvious disadvantage is thatthe window size changes a lot. Due to the game's animating boxes at each button press, i'm not sure it's a problem? (Does the game avoid playing that  animation is you use the same box size twice in a row for the same character?)

Just a quite note that anything counted in characters now would eventually have to get into pixels some day. :)