Open Text Summarizer
May 23, 2007
Open Text Summarizer is a library and command line tool (developed by Nadav Rotem) that, well, summarizes text.
It seems to do a good job. An auto-summary is an auto-summary. It has its uses and its drawbacks. Sometimes it’s just fun.
It’s included with the [k|x]?ubuntu-desktop packages, so you’ve most likely got it if you’re running Ubuntu. It’s also included with AbiWord, so if you’ve got AbiWord – which is part of xubuntu-desktop and a great lightweight alternative to OO.o in itself – you’ve got it. You run it with the command ots:
Usage: ots [OPTIONS...] [file.txt | stdin]
-r, --ratio=<int> summarization % [default = 20%]
-d, --dic=<string> dictionary to use
-o, --out=<string> output file [default = stdout]
-h, --html output as html
-k, --keywords only output keywords
-a, --about only output the summary
-v, --version show version information
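To make those options a bit more concrete, here is a rough sketch of a few invocations, using only the flags listed in the usage text above. The file names article.txt and summary.html are made up for illustration, and the run_ots wrapper is my own helper, not part of OTS. I'm also assuming that ots reads standard input when no file is given, as the usage line suggests.

```shell
#!/bin/sh
# Sketch only: article.txt and summary.html are hypothetical file names.

# Hypothetical helper: call ots only if it is installed,
# otherwise print a message and return 127.
run_ots() {
    if ! command -v ots >/dev/null 2>&1; then
        echo "ots is not installed" >&2
        return 127
    fi
    ots "$@"
}

if [ -f article.txt ]; then
    # Ask for a tighter summary than the 20% default.
    run_ots --ratio=10 article.txt

    # Print only the keywords.
    run_ots --keywords article.txt

    # Feed the text on stdin and write HTML output to a file.
    run_ots --html --out=summary.html < article.txt
fi
```

Play with the ratio first; it has the most visible effect on what comes out.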
You can see a screencast and the man page online, if you’re still not ready to commit to installing it.
There’s a list on the SourceForge page of apps that use the OTS library. There are only three, so I’ll reproduce it here:
- There was a plugin in the development version of AbiWord at the time the site was written. That development version is now ancient and the stable version now includes it. Oddly enough, I ran both the command line and AbiWord versions on a basic text version of my resume and got completely different results. Perhaps they are using different default dictionaries? Should that matter? Documents like resumes are usually not very amenable to auto-summarization. You need something more substantial and less list-like. AbiWord gave me what I expected – gibberish. The command line gave me what might be seen as commentary: a perfectly usable and well-formatted resume with just the essence left and most of the details removed. It was as if it was saying, “Wake up! This is all the HR people will see anyway. They’re all linked against libots.”
- The second is Gnome-Summarizer, a GUI by the author himself. Even more exciting-looking is the “Researcher’s Tool” demonstrated in the screencast of the next version.
- The third is a gedit plugin by Daniel Brodie.
To install just the command line app and the library, use:
sudo apt-get install libots0
There are several AbiWord packages; this will give you the basic one with Gtk2:
sudo apt-get install abiword
(Yeah, I know, that was kind of obvious. By the way, you do know you can use tab completion with apt, right?)