I'm about to start the conversion of a 1980s book from book to HTML.
One of the quirky things is its layout: 635 pages of text, pages 40
characters wide, in a fixed-pitch font (Courier?) and justified, which
sometimes looks a bit silly when only five words fit on a line, like
this (spaces replaced by underscores):
that___appear___here.___So__here's__the
In earlier conversions, like <https://prino.neocities.org/www/a-und-a/anhalterwesen-und-anhaltergefahren.html>,
I've used proper HTML, converted footnotes to end-of-chapter notes, etc.
I'm not sure what I should do here, as part of the appeal of this
long-out-of-print book is its formatting.
Any suggestions?
Robert
On 4/2/2022 11:29 AM, Robert Prins wrote:
> I'm about to start the conversion of a 1980s book from book to HTML. One of
> the quirky things was its layout, 635 pages of text, 40-character-wide pages,
> fixed-pitch font (Courier?) and justified, sometimes looking a bit silly, when
> only five words fit on a line, like (spaces replaced by underscores):
> that___appear___here.___So__here's__the
> Whereas in earlier conversions, like
> <https://prino.neocities.org/www/a-und-a/anhalterwesen-und-anhaltergefahren.html>, I've
> used proper HTML, converted footnotes to end-of-chapter notes, etc., I'm not sure
> what I should do here, as part of the appeal of this long-out-of-print book is
> its formatting.
> Any suggestions?
> Robert
DO NOT DO IT! At least do not try to replicate the appearance of the
book. You might break the book down into chapters, each in a separate
Web page subsidiary to a Table of Contents page. But do not try to
replicate each of the book's pages as a separate Web page. No one will
want to read it.
Also, do not replicate the book as a PDF file. Jakob Nielsen, an expert
on Web design and usability, says:
> Users hate coming across a PDF file while browsing, because it breaks
> their flow. Even simple things like printing or saving documents are
> difficult because standard browser commands don't work. Layouts are
> often optimized for a sheet of paper, which rarely matches the size
> of the user's browser window. Bye-bye smooth scrolling. Hello tiny
> fonts.
> Worst of all, PDF is an undifferentiated blob of content that's hard
> to navigate.
> PDF is great for printing and for distributing manuals and other big
> documents that need to be printed. Reserve it for this purpose and
> convert any information that needs to be browsed or read on the
> screen into real web pages.
Here is what I would do --
Scan the book a page at a time with optical character recognition (OCR) software. Edit the result since OCR software is generally not perfect. Combine the resulting files into one file per chapter.
Create a master CSS file for formatting. In it, specify a variable-width
font such as Georgia or Verdana, which is easier to read on a computer
monitor than Courier or other fixed-width fonts. Also, variable-width
fonts render fully justified without the problem you cite. Avoid
specifying a font that is not generally available or that is overly
artistic. Also avoid specifying colors for either the font or background.
Manually insert the necessary HTML markup into each chapter's file,
including links to the master CSS file. Create the Table of Contents
HTML file, listing only the chapters by number and title and with links
to them. Having previously proofed the OCR results, now proofread the
Web pages; better, have someone else read the Web pages aloud to you.
Test the chapters and Table of Contents at <http://validator.w3.org/>.
Test the CSS file at <http://jigsaw.w3.org/css-validator/>. Correct ALL errors.
Yes, this will be tedious, especially for a 635-page book. I have done
this with smaller documents, including a two-page newsletter that
becomes a single Web page and multi-page newspaper articles that
again become single Web pages.
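The Table of Contents step described above could be generated rather than typed by hand. The following is only a minimal sketch in Python: the function name `build_toc`, the stylesheet name `book.css`, the `lang` default, and the chapter file names are all assumptions made for illustration, not anything taken from the thread.

```python
from html import escape


def build_toc(book_title, chapters, css_href="book.css", lang="en"):
    """Return the text of a Table of Contents page.

    chapters is a list of (number, title, filename) tuples.
    css_href and lang are hypothetical defaults; adjust them for the real book.
    """
    # One list item per chapter, linking to that chapter's own page.
    items = "\n".join(
        f'    <li><a href="{escape(filename, quote=True)}">'
        f"Chapter {number}: {escape(title)}</a></li>"
        for number, title, filename in chapters
    )
    return (
        "<!DOCTYPE html>\n"
        f'<html lang="{escape(lang, quote=True)}">\n'
        "<head>\n"
        '  <meta charset="utf-8">\n'
        f"  <title>{escape(book_title)}</title>\n"
        f'  <link rel="stylesheet" href="{escape(css_href, quote=True)}">\n'
        "</head>\n"
        "<body>\n"
        f"  <h1>{escape(book_title)}</h1>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )
```

Calling `build_toc("Some Title", [(1, "First Chapter", "ch01.html")])` returns a string that can be written to the contents file; the result would still need to pass the validators mentioned above.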
On Sat, 2 Apr 2022 18:29:11 +0000, Robert Prins wrote:
> I'm about to start the conversion of a 1980s book from book to HTML.
Are you referring to IBM's BookManager `book` format,
or something on paper?
Robert Prins wrote:
> I'm about to start the conversion of a 1980s book from book to HTML. One of
> the quirky things was its layout, 635 pages of text, 40-character-wide
> pages, fixed-pitch font (Courier?) and justified,
If the entire book is in exactly that format and you wish to preserve the
format, then it is pointless to convert it to HTML, since plain text is so much
more natural. Still, if there is a technical reason to make it HTML, just put
the entire content inside a <pre> element. And if you want to keep the exact
pagination as well, make each page a <pre> element and use the CSS code
pre { page-break-before: always }
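As a rough illustration of the one-<pre>-per-page idea, here is a small Python sketch. It assumes the pages are already available as a list of plain-text strings; the function name `pages_to_pre` is made up for this example.

```python
from html import escape


def pages_to_pre(pages):
    """Wrap each page's text in its own <pre> element.

    The characters &, < and > are escaped so that text from the book
    is never mistaken for markup; spaces and line breaks are kept as-is.
    """
    return "\n".join(f"<pre>{escape(page, quote=False)}</pre>" for page in pages)
```

The escaping matters because <pre> preserves whitespace but does not stop the browser from interpreting `<` and `&` in the text.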
> sometimes looking a bit silly, when only five words fit on a line, like,
> spaces replaced by underscores
> that___appear___here.___So__here's__the
To preserve the formatting, in a plain text file or inside <pre>, you would need
to have the same number of spaces as in the printed book.
I first thought you could do things more flexibly. Since digital scanning hardly
gets the number of spaces right automatically, you could let it produce text
with single spaces between words and then use e.g.
<p>that appear here. So here's the...</p>
with
p { width: 40ch; font-family: monospace; text-align: justify; }
But it does not produce the same result. Text justification in browsers does not
simply add space characters but instead stretches spacing between words evenly.
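To show the difference, the typewriter-style justification (whole space characters added between words) can be sketched in Python. This is only an illustration: the name `justify_line` is invented here, and the rule that surplus spaces go to the leftmost gaps is an assumption, so the output will not necessarily match how the book distributed its spaces.

```python
def justify_line(words, width=40):
    """Join words with whole spaces so the line is exactly `width` characters."""
    if len(words) < 2:
        # A single word (or nothing) cannot be justified; return it unpadded.
        return "".join(words)
    gaps = len(words) - 1
    spaces = width - sum(len(word) for word in words)
    if spaces < gaps:
        raise ValueError("words do not fit in the requested width")
    # Every gap gets `base` spaces; the first `extra` gaps get one more.
    # Assumption: surplus goes to the leftmost gaps, which may differ from the book.
    base, extra = divmod(spaces, gaps)
    parts = []
    for i, word in enumerate(words[:-1]):
        parts.append(word + " " * (base + (1 if i < extra else 0)))
    parts.append(words[-1])
    return "".join(parts)
```

Output produced this way would have to go inside <pre> (or a plain text file), since ordinary HTML collapses runs of spaces.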
Perhaps you should first analyze how the book content is best presented in modern media, instead of assuming that the original format should be preserved,
even if it was simply caused by the limitations of tools.
(About 40 years ago, I co-authored a book that was published as typewritten, in
a monospace font. If someone wanted to publish it now in digital format, it would be pointless to imitate that format, and I would surely use my rights to
forbid that!)