ScroogeXHTML
RTF to HTML converter library

Released 22 April 2021

ScroogeXHTML for the Java platform is a library which supports a subset of the Rich Text Format (RTF) standard and converts RTF to HTML5 and XHTML, either as a standalone document or as a fragment which can be embedded in other documents.

It supports advanced document features including hyperlinks, bookmarks, footnotes, field results and simple tables

It minimizes documents by using CSS font definitions and by generating local font styles only for text with different attributes

It includes an API for post-processing of the intermediate DOM tree which allows additional fine tuning, based on XPath

It is easy to use, compact and fast, and requires no external runtime libraries except the SLF4J logging facade

The web-based online demo converts an RTF document and displays the resulting HTML5

Try the demo

Picture data extraction, complimentary conversion code for BMP/JPG/PNG images to Data URIs

Conversion example 1

The RTF for this example has been created from the original RTF specification using LibreOffice.

Input (RTF document) - as shown in MS Wordpad
example 1 RTF
Notes
  • Wordpad auto-detects URLs and renders them as hyperlinks, even if they do not use a HYPERLINK field. With ScroogeXHTML, this auto-detection can be emulated in a post-processing step.
  • Wordpad renders table borders with a light grey line, even if the table has no border style. A similar effect can be achieved in ScroogeXHTML with custom CSS.
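
The URL auto-detection mentioned in the notes above could be emulated roughly as follows. This is only a sketch built on the standard Java XPath and W3C DOM APIs. It assumes that the intermediate DOM tree is available as an org.w3c.dom.Document whose elements are not in an XML namespace, and it does not show how that document is obtained from ScroogeXHTML. The class name UrlAutoLinker, its methods and the URL pattern are hypothetical and not part of the library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of ScroogeXHTML: wraps plain-text URLs in <a> elements.
final class UrlAutoLinker {

    // Deliberately simple pattern (an assumption): http/https URLs up to the
    // next whitespace, quote or angle bracket. Trailing punctuation is not trimmed.
    private static final Pattern URL = Pattern.compile("https?://[^\\s<>\"]+");

    /** Links URLs in text nodes that are not already inside an <a>; returns the number of links created. */
    static int linkUrls(Document doc) {
        NodeList hits;
        try {
            hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//text()[not(ancestor::a)]", doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        // Copy the matches first, because the loop below modifies the tree
        List<Text> texts = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            texts.add((Text) hits.item(i));
        }
        int count = 0;
        for (Text text : texts) {
            count += linkUrlsIn(text);
        }
        doc.normalize(); // remove empty text nodes left over from splitting
        return count;
    }

    private static int linkUrlsIn(Text rest) {
        int count = 0;
        Matcher m = URL.matcher(rest.getData());
        while (m.find()) {
            Text urlText = rest.splitText(m.start());           // rest keeps the text before the URL
            Text tail = urlText.splitText(m.end() - m.start()); // urlText now holds only the URL
            Element a = rest.getOwnerDocument().createElement("a");
            a.setAttribute("href", urlText.getData());
            urlText.getParentNode().replaceChild(a, urlText);
            a.appendChild(urlText);
            rest = tail;                                        // continue behind the new link
            m = URL.matcher(rest.getData());
            count++;
        }
        return count;
    }

    /** Parses a small XML string into a DOM document (used for the demo only). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<p>See https://example.com/spec for details</p>");
        System.out.println(linkUrls(doc) + " link(s) created");
    }
}
```

Text that already sits inside an a element is skipped, so URLs that come from a HYPERLINK field are not wrapped a second time.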
Output (HTML document) - as shown in Mozilla Firefox
example 1 HTML
Notes
  • To create similar grey borders around table cells as in Wordpad, the custom CSS for the conversion includes a custom border format:
td {
  vertical-align: top;
  border: 1px solid silver;
}

Conversion example 2

The RTF for this example has been created using LibreOffice 6.1.6.3, then opened in MS Wordpad.

Input (RTF document) - as shown in MS Wordpad
example 2 RTF
Notes
  • The first picture uses the JPG format, the second uses PNG
Output (HTML document) - as shown in Mozilla Firefox
example 2 HTML
Notes

Code example

Java code

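// The import statement for the ScroogeXHTML class is omitted here
public class Main {
    public static void main(String... args) {
        // Create a converter with its default settings
        ScroogeXHTML converter = new ScroogeXHTML();
        // Convert a minimal RTF document and print the resulting HTML
        String html = converter.convert("{\\rtf1 Hello world}");
        System.out.println(html);
    }
}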

HTML output

<p>Hello world</p>

Font style example

Java code

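// The import statement for the ScroogeXHTML class is omitted here
public class Main {
  public static void main(String... args) {
    ScroogeXHTML converter = new ScroogeXHTML();
    // \b switches bold on, \i switches italic on, \i0 switches italic off again;
    // the closing brace ends the bold group and \par ends the paragraph
    String html = converter.convert("{\\rtf1 {\\b Bold \\i Bold Italic \\i0 Bold again} \\par}");
    System.out.println(html);
  }
}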

HTML output

<p>
    <span style="font-weight:bold;">Bold </span><span style="font-weight:bold;font-style:italic;">Bold Italic </span><span style="font-weight:bold;">Bold again</span> 
</p>

Text properties
  • Bold/italic/underlined: ✓/✓/✓
  • Foreground/background/highlight color: ✓/✓/✓
  • Subscript/superscript: ✓/✓
  • Strikethrough/hidden: ✓/✓
  • LTR/RTL text: ✓/✓
  • Unicode/DBCS text: ✓/✓
  • Language attribute
Paragraph properties
  • Left/right/centered/justified alignment: ✓/✓/✓/✓
  • Left/right/first line indent: ✓/✓/✓
  • Background color
  • Border box color/width: ✓/✓
  • Space before/after: ✓/✓
  • Numbered/unnumbered lists: ✓/✓
  • Tabulators replaced by a sequence of non-breaking spaces
Tables
  • Simple tables/nested tables: ✓/—
  • Total width
  • Left margin
  • Column width
  • Horizontal cell merging
  • Cell background color
  • Row height
  • Cell border color/width: ✓/✓
Other content
  • Hyperlink fields
  • Bookmark fields
  • Footnotes
Output document types
  • HTML5
  • XHTML 1.0 Transitional
Optimization
  • CSS based document minimizer
  • Font name substitution
Logging framework
  • SLF4J support
Java platform
  • Minimum supported platform: Java SE 8
  • Tested with JDK 8/11/14/15/16: ✓/✓/✓/✓/✓
Dependencies
  • SLF4J
Advanced features
  • Picture data extraction ② ③: ✓ (BMP/EMF/JPG/PICT/PNG/WMF)
  • Header/Footer text
  • Post processing (W3C DOM based)
Tests
  • JUnit tests
Other
  • Installer and uninstaller

① uses direct text formatting, does not generate <ol>/<ul>/<li> HTML elements

② conversion of the extracted raw image data to a web-safe format is not included

③ the included MemoryPictureAdapterBase64 supports embedding of PNG and JPG pictures

④ experimental / unsupported feature

⑤ only included in the Source Edition

Manual

The "Getting Started" user guide

API Docs

Online API documentation

Release Notes

Enhancements and bug fixes

FAQ

Frequently Asked Questions

Contact

Support and sales inquiries

Blog

Technical articles and announcements

Installer — Source Edition:

Download »

Installer — Jar edition:

Download »

General

On the library home page you will find a link to the download area for registered users. The credentials (user name and password) will be sent to you when a new release is available. New releases will also be announced on the blog at https://scroogexhtml.wordpress.com/

Go to the download section

Yes, a free online demo version is available at https://www.scroogexhtml.com/sxd. To check whether the library meets your requirements, you may also purchase a Single Developer license, which includes a 14-day full money-back guarantee. This lets you test the full version of the library without any risk. The reseller (ShareIt) will give a full refund if you find that the library does not work as expected.

Go to the online demo

Picture Support

The library does not convert embedded pictures. It extracts the raw picture data from the RTF document (and some meta data). The extracted picture data may be in PNG, JPG, BMP, WMF, or other formats. The library may contain complimentary code for picture processing support, but this code is experimental and unsupported.
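
Once raw picture data has been extracted, PNG and JPG bytes can be embedded in HTML as a Data URI. The sketch below uses only the Java standard library and is not the code shipped with the library. The class name DataUriSketch and its methods are hypothetical, and the picture bytes are assumed to be available as a byte array. Other formats (for example BMP or WMF) are not handled here.

```java
import java.util.Base64;

// Hypothetical helper, not part of ScroogeXHTML
final class DataUriSketch {

    /** Guesses the MIME type from the leading signature bytes; returns null for other formats. */
    static String sniffMimeType(byte[] data) {
        // PNG files start with the 8-byte signature 89 50 4E 47 0D 0A 1A 0A
        if (startsWith(data, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) {
            return "image/png";
        }
        // JPEG files start with FF D8 FF
        if (startsWith(data, 0xFF, 0xD8, 0xFF)) {
            return "image/jpeg";
        }
        return null; // not recognized by this sketch
    }

    private static boolean startsWith(byte[] data, int... signature) {
        if (data.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if ((data[i] & 0xFF) != signature[i]) {
                return false;
            }
        }
        return true;
    }

    /** Builds a Base64 Data URI which can be used as the src attribute of an img element. */
    static String toDataUri(String mimeType, byte[] data) {
        return "data:" + mimeType + ";base64," + Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        // Only the first bytes of a JPEG stream, enough to demonstrate the detection
        byte[] jpegStart = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};
        String mimeType = sniffMimeType(jpegStart);
        System.out.println("<img src=\"" + toDataUri(mimeType, jpegStart) + "\" alt=\"\"/>");
    }
}
```

Base64 encoding increases the size of the picture data by about one third, so this approach is best suited to small pictures.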

Developer License

Yes, each developer that uses our products must have their own license.

No, developer licenses are perpetual. However, you will be limited to the last product version released before your free upgrade period expired. Licenses may also be revoked in case of license violations, or violations of the ShareIt terms of sale.

Yes! If you are a registered user, please contact Habarisoft and ask for a discount coupon code.

Source Code Distribution

No, the source code is not redistributable, even if you change it. Under no circumstances is it acceptable to disclose the source to any third party.

Server Deployment License

License types (Single Developer License and Server Deployment License) are explained on the page https://www.scroogexhtml.com/scroogexhtml_license.html.

ScroogeXHTML License Types

No. A Server Deployment License covers all applications on the server.

Server Deployment Licenses are available under two license models, 'Perpetual' or 'Subscription'. When licensed as a subscription, the license expires when the subscription ends. Server Deployment Licenses may also expire in case of license violations, or violations of ShareIt terms of sale.

No. The proof of purchase for Server Deployment Licenses is your license document.

No. ScroogeXHTML Server Deployment Licenses, or parts of them, may not be distributed, sold, rented or transferred to any other party. This also applies in the case of mergers and acquisitions of the license holder.

Actiance

Advocate Health Care

APT Business Solutions

Artisan Design Group

Becton, Dickinson and Company

Bracari

Canadian Natural Resources

Criterion Security Services

Datamine

Denim Group

e-vendo

Fatax

GE Medical Systems

Glencore International AG

Include Software

Iodine Software

洁茹 牛

Mayo Foundation

Manuh Solutions

Micrologos Software Developer

NVISIA

NxGen Software

PAGU.at

Philips Medical Systems

ProClarity Corporation

Promutuel Assurance

QuadraMed Corporation

Saxos Informatica

Scherer Software

Sigmalogic

Stanford University

TIP Technologies

YADA Systems

"This is an excellent unique product that has saved us many hours of work. It is simple to use with lots of documentation." — Stewart S., UK

"Scrooge has really helped me out! I'm converting a database of 10,000 questions in RTF into HTML, Scrooge has been invaluable!" — J. M., USA

"It works great and it is stunningly fast - on production, converting 115.000 documents takes 25 seconds instead of 9 hours." — Robert S., Germany

"We are very, very glad with this Component" — M. R., Germany