ScroogeXHTML
RTF to HTML converter library

Released 29 May 2021

ScroogeXHTML is a library which supports a subset of the Rich Text Format (RTF) standard. It converts RTF to HTML5 and XHTML standalone documents, or to fragments which can be embedded in other documents.

Supports hyperlinks, bookmarks, multi-language and LTR/RTL text, field results and simple tables

Minimizes documents using CSS font definitions and generation of local font styles for text with different attributes

Includes an API for post-processing of the intermediate DOM tree to support additional fine tuning

Easy to use, compact and fast, and requires no external runtime libraries except the SLF4J logging facade

The web-based online demo converts an RTF document to HTML5

Online demo

Picture data extraction, complimentary code for embedding BMP/JPG/PNG images as Data URIs
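
As a rough illustration of what embedding a picture as a Data URI involves, the following sketch encodes raw picture bytes with Base64 and wraps them in an img element. It uses only the Java standard library and does not use the library's own picture classes. The names DataUriSketch, toDataUri and toImgElement are hypothetical, and the input bytes are a stand-in for picture data extracted from an RTF document.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class DataUriSketch {

    /** Builds a data URI (RFC 2397) from raw picture bytes and a MIME type. */
    public static String toDataUri(byte[] pictureData, String mimeType) {
        String base64 = Base64.getEncoder().encodeToString(pictureData);
        return "data:" + mimeType + ";base64," + base64;
    }

    /** Wraps the data URI in an img element; alt is assumed to need no escaping. */
    public static String toImgElement(byte[] pictureData, String mimeType, String alt) {
        return "<img src=\"" + toDataUri(pictureData, mimeType) + "\" alt=\"" + alt + "\">";
    }

    public static void main(String[] args) {
        // Stand-in bytes; real input would be picture data extracted from the RTF document.
        byte[] placeholder = "not a real PNG".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toImgElement(placeholder, "image/png", "example"));
    }
}
```

Data URIs keep the HTML document self-contained, at the cost of roughly one third more bytes than the binary picture data because of the Base64 encoding.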

Conversion example 1

The RTF for this example has been created from the original RTF specification using LibreOffice.

Input (RTF document) - as shown in MS Wordpad
example 1 RTF
Notes
  • Wordpad auto-detects URLs and renders them as hyperlinks, even if they do not use a HYPERLINK field. With ScroogeXHTML, this auto-detection can be emulated in a post-processing step.
  • Wordpad renders table borders with a light grey line, even if the table has no border style. The same effect can be achieved in ScroogeXHTML with custom CSS.
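
The URL auto-detection mentioned in the first note could be emulated along the following lines. This is a minimal sketch that assumes the tree to be post-processed is available as a standard org.w3c.dom document; it does not use the ScroogeXHTML post-processing API. The class name AutoLinkSketch, the method autoLink and the deliberately simple URL pattern are illustrative assumptions.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;

public class AutoLinkSketch {

    // Simplistic pattern: it would also swallow punctuation that directly follows a URL.
    private static final Pattern URL = Pattern.compile("https?://[^\\s<>\"]+");

    /** Wraps plain-text URLs below the given node in a elements. */
    public static void autoLink(Node root) {
        // Collect text nodes first so the tree is not modified while it is walked.
        List<Text> textNodes = new ArrayList<>();
        collectTextNodes(root, textNodes);
        for (Text text : textNodes) {
            linkify(text);
        }
    }

    private static void collectTextNodes(Node node, List<Text> out) {
        // Skip existing hyperlinks so that a URL is never wrapped twice.
        if (node instanceof Element && "a".equalsIgnoreCase(node.getNodeName())) {
            return;
        }
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.add((Text) node);
            return;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collectTextNodes(children.item(i), out);
        }
    }

    private static void linkify(Text text) {
        String value = text.getData();
        Matcher m = URL.matcher(value);
        if (!m.find()) {
            return;
        }
        Document doc = text.getOwnerDocument();
        Node parent = text.getParentNode();
        int last = 0;
        do {
            if (m.start() > last) {
                // Plain text between the previous match and this URL.
                parent.insertBefore(doc.createTextNode(value.substring(last, m.start())), text);
            }
            Element a = doc.createElement("a");
            a.setAttribute("href", m.group());
            a.setTextContent(m.group());
            parent.insertBefore(a, text);
            last = m.end();
        } while (m.find());
        // The original node keeps only the text after the last URL, or is removed if none is left.
        if (last < value.length()) {
            text.setData(value.substring(last));
        } else {
            parent.removeChild(text);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<p>See https://www.example.com/ for details</p>".getBytes(StandardCharsets.UTF_8);
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        autoLink(doc.getDocumentElement());
        Element link = (Element) doc.getElementsByTagName("a").item(0);
        System.out.println(link.getAttribute("href"));
    }
}
```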
Output (HTML document) - as shown in Mozilla Firefox
example 1 HTML
Notes
  • To create grey borders around table cells similar to those in Wordpad, the custom CSS for the conversion includes a border rule:
td {
  vertical-align: top;
  border: 1px solid silver;
}

Conversion example 2

The RTF for this example has been created using LibreOffice 6.1.6.3, then opened in MS Wordpad.

Input (RTF document) - as shown in MS Wordpad
example 2 RTF
Notes
  • The first picture uses the JPG format, the second uses PNG
Output (HTML document) - as shown in Mozilla Firefox
example 2 HTML

Code example

Java code

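// Converts a minimal RTF document and prints the resulting HTML.
public static void main(String[] args) throws IOException {
    ScroogeXHTML scrooge = new ScroogeXHTML();
    // Emit a complete HTML document (html, head and body), as shown in the output below
    scrooge.setAddOuterHTML(true);
    String html = scrooge.convert("{\\rtf1 \"Hello world!\"}");
    System.out.println(html);
}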

HTML output

<!DOCTYPE html>
<html>
  <head>
    <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>Untitled document</title>
    <meta content="ScroogeXHTML for the Java platform 9.3.0" name="generator">
  </head>
  <body>
    <p>"Hello world!"</p>
  </body>
</html>

Text properties
  • Bold/italic/underlined: ✓/✓/✓
  • Foreground/background/highlight color: ✓/✓/✓
  • Subscript/superscript: ✓/✓
  • Strikethrough/hidden: ✓/✓
  • LTR/RTL text: ✓/✓
  • Unicode/DBCS text: ✓/✓
  • Language attribute

Paragraph properties
  • Left/right/centered/justified alignment: ✓/✓/✓/✓
  • Left/right/first line indent: ✓/✓/✓
  • Background color
  • Border box color/width: ✓/✓
  • Space before/after: ✓/✓
  • Numbered/unnumbered lists ①: ✓/✓
  • Tabulators: replaced by a sequence of non-breaking spaces

Other content
  • External hyperlinks
  • Internal links (bookmarks)
  • Footnotes

Pictures
  • Picture data extraction
  • Extracted picture formats: BMP/EMF/JPG/PICT/PNG/WMF
  • Supported picture storage types: binary/hexadecimal
  • Picture data conversion ②: -

Tables
  • Simple tables/nested tables: ✓/—
  • Total width
  • Left margin
  • Column width
  • Horizontal cell merging
  • Cell background color
  • Row height
  • Cell border color/width ③: ✓/✓

Output document types
  • HTML5
  • XHTML 1.0 Transitional

Optimization
  • CSS based document minimizer
  • Font name substitution

Logging framework
  • SLF4J support

Java platform
  • Minimum supported platform: Java SE 8
  • Tested with JDK 8/11/16: ✓/✓/✓

Dependencies
  • SLF4J

Advanced features
  • Header/footer text
  • Post-processing events

Tests
  • JUnit tests ④

Other
  • Installer and uninstaller

① uses direct text formatting, does not generate HTML ol or ul elements

② the complimentary class MemoryPictureAdapterDataURI supports embedding of JPEG and PNG pictures as Data URIs

③ experimental or unsupported feature

④ included in the Source Edition

Manual

The "Getting Started" user guide

API Docs

Online API documentation

Release Notes

Enhancements and bug fixes

FAQ

Frequently Asked Questions

Contact

Support and sales inquiries

Blog

Technical articles and announcements

Installer — Source Edition:

Download »

Installer — Jar edition:

Download »

General

On the library home page you will find a link to the download area for registered users. The credentials (user name and password) will be sent to you when a new release is available. New releases will also be announced on the blog at https://scroogexhtml.wordpress.com/

Go to the download section

Yes, a free online demo version is available at https://www.scroogexhtml.com/sxd. To check whether the library meets your requirements, you may also purchase a Single Developer license, which includes a 14-day full money-back guarantee. This allows you to test the full version of the library without any risk. The reseller (ShareIt) will give a full refund if you find that the library does not work as expected.

Go to the online demo

Picture Support

The library does not convert embedded pictures; it only extracts the raw picture data from the RTF document. The library may contain complimentary code for picture processing support, but this code is experimental and not covered by the basic support plan.

Privacy and Security

Upgrading

Yes, you may receive a discount coupon code to purchase the Source Edition at a reduced price.

Developer License

Yes, each developer that uses our products must have their own license.

No, developer licenses are perpetual. However, you will be using the last product version released before your free upgrade period expired. A license may also be revoked in case of license violations.

Source Code Distribution

No, the source code is not redistributable, even if you change it. Under no circumstances is it acceptable to disclose the source to any third party.

Server Deployment License

License types (Single Developer License and Server Deployment License) are explained on the page https://www.scroogexhtml.com/scroogexhtml_license.html.

ScroogeXHTML License Types

No. A Server Deployment License covers all applications on the server.

Server Deployment Licenses are available under two license models, 'Perpetual' or 'Subscription'. When licensed as a subscription, the license expires when the subscription ends. Licenses may also expire in case of license violations.

No, the proof of purchase for Server Deployment Licenses is your license document.

No. ScroogeXHTML Server Deployment Licenses, or parts of them, may not be distributed, sold, rented or transferred to any other party; this includes mergers and acquisitions of the license holder.

Actiance

Advocate Health Care

APT Business Solutions

Artisan Design Group

Axalta Coating Systems

Becton, Dickinson and Company

Bracari

Canadian Natural Resources

Criterion Security Services

Datamine

Fatax

GE Medical Systems

Glencore International AG

Include Software

Iodine Software

Manuh Solutions

NxGen Software

PAGU.at

ProClarity Corporation

Promutuel Assurance

QuadraMed Corporation

Sigmalogic

Stanford University

TIP Technologies

"This is an excellent unique product that has saved us many hours of work. It is simple to use with lots of documentation." — Stewart S., UK

"Scrooge has really helped me out! I'm converting a database of 10,000 questions in RTF into HTML, Scrooge has been invaluable!" — J. M., USA

"It works great and it is stunningly fast - on production, converting 115.000 documents takes 25 seconds instead of 9 hours." — Robert S., Germany

"We are very, very glad with this Component" — M. R., Germany