Thursday, May 09, 2013
Version 2.2 of bson4jackson
has just been released. bson4jackson adds support for BSON, a binary
representation of JSON, to the Jackson JSON processor.
The latest release of bson4jackson now supports Jackson 2.2. Apart from that,
Ben McCann and John Stoneham
fixed the Maven dependencies and updated some third-party libraries, so project builds
depending on bson4jackson should now be more stable. Thanks a lot for that, guys!
Projects using bson4jackson
bson4jackson is used in several other Open Source projects, including the following:
- Jongo is a rather cool library that allows MongoDB to
  be queried in Java just like you would query it in the MongoDB shell. Jongo
  uses bson4jackson to serialize objects before they are sent to the database,
  and of course to deserialize queried documents.
  http://jongo.org/
- MongoJack is a POJO mapper that uses Jackson and bson4jackson to serialize
  and deserialize objects before they are sent to the database. MongoJack
  is extremely fast and very easy to handle.
  http://mongojack.org/
I know that there are a lot of other projects out there that use bson4jackson.
If you want your project to be added to this list please leave a comment
below or send me a message.
For a complete description of bson4jackson (including how to download it)
have a look at my tutorial.
Tags: BSON, bson4jackson, Jackson, Java, JSON
Thursday, May 02, 2013
I’ve just released the source code of the Spamihilator website
as Open Source. You can download it from the following
GitHub repository:
https://github.com/michel-kraemer/spamihilator.com
Everyone is invited to make contributions! I’m open to all kinds
of changes. You may submit new content (e.g. FAQ), change the design or
style, etc.
If you want to contribute please fork the GitHub repository and send me
pull requests. I will check and upload them to the Spamihilator web server
as soon as possible. Further instructions can be found below or
in the README file.
Building
The Spamihilator website has been created using Jekyll. If
you want to build it please follow these steps:
- Download and install Ruby 1.8.7 (if you haven’t done so already).
  Under Windows I recommend using RubyInstaller.
  Under Linux and Mac OS I highly recommend using rvm,
  as the repository already contains proper .ruby-version and
  .ruby-gemset files.
- Install the bundler gem (if you haven’t done so already):
    gem install bundler
- Clone or download this repository.
- Open a command line shell in the cloned directory and enter the
  following command:
    bundle install
- After that you are ready to build the website using the following
  commands:
    compass compile
    jekyll
- Repeat these commands whenever you make a change. The files will be
  compiled to the subdirectory _site.
Run locally
You may also run and test the website locally before uploading your
changes. In order to do this, follow the instructions above and then
run the following command:
jekyll --server
Launch a web browser and open http://localhost:4000 to view the site.
For more information see Jekyll’s website.
License

If not noted otherwise the files in the Spamihilator website
repository are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Tags: Spam, Spamihilator, Open Source
Saturday, March 30, 2013
This is an English translation of my German blog post.
I updated it so the measures described here are compatible with the
latest phpBB 3.x version.
phpBB is an open-source forum software that
is very popular and widely used. This makes it an ideal target for
spammers. The phpBB developers therefore implemented an improved
Captcha in version 3.0. But spammers have already adapted to this and
have implemented improved bots that are able to break the new Captcha
and to automatically create junk posts. In the following I will describe
five anti-spam measures that effectively reduce spam in every phpBB 3.x
instance. The main goal of these measures is to block as many spam
posts as possible without affecting normal forum users.
Measure #1: links
Typically spammers try to advertise certain websites. About 95%
of all spam posts contain links or URLs. The most effective way to
block those posts is to completely forbid links. However, this would
also affect normal forum users.
Spammers usually sign in to a forum and then immediately start posting junk.
We can make use of this and forbid links only for guests and users with
fewer than a certain number of posts. Once a normal user has reached
this number of posts links will be enabled. Typical spam bots will
never reach this number since all their posts will be blocked.
To forbid posting links you have to add the following to the function submit_post()
in the file includes/functions_posting.php.
//Define the minimum number of posts for "good" users
//Users below this threshold are considered potential spammers
$user_posts_threshold = 3;
//strip whitespace characters from the post body so that links
//split up by spaces or line breaks are detected as well
$msgwows = str_replace(array(" ", "\n", "\r", "\t"), "", $data['message']);
if (!$user->data['is_registered'] ||
    $user->data['user_posts'] < $user_posts_threshold) {
  //stripos() (PHP 5) ignores case, so 'HTTP://' or 'WWW.' are caught too
  if (stripos($msgwows, 'http://') !== FALSE ||
      stripos($msgwows, 'https://') !== FALSE ||
      stripos($msgwows, 'ftp://') !== FALSE ||
      stripos($msgwows, 'www.') !== FALSE ||
      stripos($msgwows, '[url') !== FALSE) {
    trigger_error("You are not allowed to post URLs!");
  }
}
This code should be put at the beginning of the submit_post() function
(after the function's global declarations, so that $user is in scope)
to check all posts before they are saved to the database.
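The check itself does not depend on phpBB. The following is a minimal, standalone sketch of the same logic in Java, meant only to show why the whitespace is stripped before searching. The class and method names are hypothetical, and lowercasing the text as well as looking for https:// are my assumptions about what a forum would want to catch.

```java
import java.util.Locale;

/** Standalone sketch of the link check from measure #1 (not phpBB code). */
final class LinkFilterSketch {
    // Substrings that mark a post as containing a link (assumed list).
    private static final String[] MARKERS = {
        "http://", "https://", "ftp://", "www.", "[url"
    };

    /** Returns true if the message contains a link marker once whitespace is removed. */
    static boolean containsLink(String message) {
        // Remove all whitespace so that "h t t p : / /" cannot slip through,
        // and lowercase the text so that "WWW." is found as well.
        String compact = message.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
        for (String marker : MARKERS) {
            if (compact.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    /** Links are only forbidden for guests and users below the post threshold. */
    static boolean mustBlock(boolean registered, int userPosts, int threshold, String message) {
        boolean untrusted = !registered || userPosts < threshold;
        return untrusted && containsLink(message);
    }
}
```

With a threshold of 3, mustBlock(false, 0, 3, "[url=foo]bar[/url]") rejects a guest post, while the same message from a registered user with 5 posts is accepted.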
Measure #2: images
Spammers often try to trick spam filters by posting images instead of text.
They put their junk messages and links into image files and then attach them
to forum posts. You can use the same technique as the one described in
measure #1 to block images for guests and users with fewer than a
certain number of posts. Put the following code in the function submit_post()
in the file includes/functions_posting.php, below the code from measure #1
(it reuses the variables $msgwows and $user_posts_threshold).
if (!$user->data['is_registered'] ||
    $user->data['user_posts'] < $user_posts_threshold) {
  //stripos() (PHP 5) ignores case, so '[IMG' is caught too
  if (stripos($msgwows, '[img') !== FALSE) {
    trigger_error("You are not allowed to post images!");
  }
}
Measure #3: Russian and Chinese posts?
A lot of spam posts are written in Russian or Chinese or simply
contain a lot of special characters and garbage. If your forum
is aimed at English-speaking users you can check if a post is written in
English before it is submitted. Posts that mostly contain special
characters or foreign characters can then be treated as spam.
Cory Mawhorter has developed a
PHP function
(is_english()) that is able to recognise special characters. You can
use this function to differentiate English posts from any others.
if (!$user->data['is_registered'] ||
    $user->data['user_posts'] < $user_posts_threshold) {
  if (!is_english($msgwows, 0.75)) {
    trigger_error("Only English posts are allowed here!");
  }
}
Measure #4: http:BL
Project Honey Pot offers an
effective system to keep spammers and mail address harvesters away from
websites. http:BL matches
the website visitor’s IP address against a database. If the IP address is
known to be used by a spammer the visitor will be blocked before the
website is even rendered. The system uses DNS which makes queries
very fast.
In order to use http:BL you first have to sign up for Project Honey Pot.
You will receive a special key that is used to authenticate against
the system. They already offer a MOD for phpBB
but it is only compatible with version 2.0. You may be able to make it
compatible with phpBB 3, but alternatively you can simply put the following
code at the end of the file common.php.
//configure your http:BL Access Key here
$httpblkey = "xxxxxxxxxxx";
$httpblmaxdays = 21;
$httpblmaxthreat = 25;
//if you already configured a honey pot on your website use this line:
//$httpblhoneypot = "http://xxxxxxxxxxx";
function httpbl_check() {
  global $httpblkey, $httpblmaxdays, $httpblmaxthreat, $httpblhoneypot;
  $ip = $_SERVER["REMOTE_ADDR"];
  //query <key>.<reversed IP address>.dnsbl.httpbl.org
  $result = explode(".", gethostbyname($httpblkey."."
    .implode(".", array_reverse(explode(".", $ip)))
    .".dnsbl.httpbl.org"));
  if ($result[0] != 127) {
    //something went wrong or the IP is not in the database.
    //ignore this one.
    return;
  }
  $days = $result[1];
  $threat = $result[2];
  if ($days < $httpblmaxdays && $threat > $httpblmaxthreat) {
    //empty() avoids a notice if no honey pot has been configured
    if (!empty($httpblhoneypot)) {
      header("HTTP/1.1 301 Moved Permanently");
      header("Location: ".$httpblhoneypot);
    }
    die();
  }
}
httpbl_check();
Please make sure to put your http:BL access key in the variable $httpblkey.
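To make the lookup format easier to follow, here is a minimal, standalone sketch in Java of the two steps the function above performs: building the DNS name and interpreting the answer. It does no network access. The class and method names are hypothetical, and the meaning of the answer's second and third parts (days and threat) is taken from the code above.

```java
/** Standalone sketch of the http:BL lookup format (not phpBB code). */
final class HttpBlSketch {
    /** Builds the DNS name to query: key, reversed IPv4 octets, then dnsbl.httpbl.org. */
    static String queryName(String accessKey, String ipv4) {
        String[] o = ipv4.split("\\.");
        if (o.length != 4) {
            throw new IllegalArgumentException("IPv4 address expected: " + ipv4);
        }
        return accessKey + "." + o[3] + "." + o[2] + "." + o[1] + "." + o[0]
                + ".dnsbl.httpbl.org";
    }

    /** Decides whether to block a visitor, given the dotted answer of the lookup. */
    static boolean shouldBlock(String answer, int maxDays, int maxThreat) {
        String[] o = answer.split("\\.");
        // Anything that does not start with 127 means "not listed" or "lookup failed".
        if (o.length != 4 || !o[0].equals("127")) {
            return false;
        }
        try {
            int days = Integer.parseInt(o[1]);   // second part, as in the PHP code
            int threat = Integer.parseInt(o[2]); // third part, as in the PHP code
            return days < maxDays && threat > maxThreat;
        } catch (NumberFormatException e) {
            return false; // unexpected answer, ignore it
        }
    }
}
```

For a made-up access key abcdefghijkl and the visitor address 1.2.3.4, queryName returns abcdefghijkl.4.3.2.1.dnsbl.httpbl.org. With the limits from above (21 days, threat 25), an answer of 127.3.40.1 leads to a block, whereas 127.30.40.1 does not because the last activity is too old.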
Measure #5: Akismet
Another technique to block Internet spam is Akismet.
This system is usually used in WordPress blogs to block comment spammers.
Just like for Project Honey Pot you need to sign up
to receive an API key.
You can use Akismet to block posts in phpBB 3 forums as well. The system
may produce false positives (normal posts accidentally identified as spam).
I therefore recommend checking only the first posts of a new user, until
he or she has reached a certain number of posts. The following code
uses the file Akismet.class.php that can be downloaded from
Alex Potsides’ blog or from
his GitHub repository. Put
the code in the function submit_post()
in the file includes/functions_posting.php.
//configure your Akismet API key here
$akismet_key = 'xxxxxxxxxxx';
//the URL you entered when you registered for a WordPress account
$akismet_url = 'xxxxxxxxxxx';
include('Akismet.class.php');
$akismet = new Akismet($akismet_url, $akismet_key);
if (!$user->data['is_registered']) {
  $akismet->setCommentAuthor($username);
} else {
  $akismet->setCommentAuthor($user->data['username']);
}
$akismet->setCommentContent($data['message']);
$akismet->setUserIP($user->ip);
if ($user->data['is_registered']) {
  $akismet->setCommentAuthorEmail(strtolower($user->data['user_email']));
  $akismet->setCommentAuthorURL(strtolower($user->data['user_website']));
}
//Akismet is only asked about guests and users below the threshold
if ((!$user->data['is_registered'] ||
    $user->data['user_posts'] < $user_posts_threshold) &&
    $akismet->isCommentSpam()) {
  trigger_error("Akismet says your post is spam");
}
Put your Akismet API key into the variable $akismet_key. The URL
you entered during sign-up has to
be put in the variable $akismet_url.
Akismet is also well suited to blocking spammers who try to
sign up to your forum. Put the following code into the function
user_add() in the file includes/functions_user.php.
//configure your Akismet API key here
$akismet_key = 'xxxxxxxxxxx';
//the URL you entered when you registered for a WordPress account
$akismet_url = 'xxxxxxxxxxx';
include('Akismet.class.php');
$akismet = new Akismet($akismet_url, $akismet_key);
$akismet->setCommentAuthor($username_clean);
$akismet->setUserIP($user->ip);
$akismet->setCommentAuthorEmail(strtolower($user_row['user_email']));
if ($akismet->isCommentSpam()) {
  trigger_error("Akismet says you are a spammer");
}
Conclusion
The measures presented here help drastically reduce spam in
phpBB 3.0-based forums. Since I implemented them in the
Spamihilator forum a couple
of years ago, only a very small number of spammers have actually been able to
post. However, none of their messages contained links, URLs or images.
They mostly consisted of a handful of random, meaningless words.
Forbidding links and images is in my experience the most effective way
to block spammers. Searching for special characters and foreign languages
blocks all other spam posts that do not contain links or images.
Normal users are typically not affected by these measures. As soon as
a normal user reaches a certain number of ‘good’ posts the anti-spam measures
are disabled. Up to now, in the Spamihilator forum
no spammer was able to reach this limit. 3 or 5 posts
is in my experience a good threshold. If ever needed, this limit can easily
be raised.
Spammers often try to put links and images into signatures. I highly recommend
disabling this in phpBB’s administration area. You may also try to apply
the link and image filters from measures #1 and #2 to signatures.
Many phpBB forums plagued by spammers disable guest posts. Users have to be registered to post.
For support forums like Spamihilator’s
this can be tedious for users who would like to easily post support requests
without having to go through the complete sign-up procedure. The measures presented
here allow forum administrators to leave guest posts enabled.
Tags: Akismet, phpBB, Project Honey Pot, Spam
Monday, April 09, 2012
Version 2.0 of bson4jackson
has just been released. bson4jackson adds support for BSON, a binary
representation of JSON, to the Jackson JSON processor.
The latest release of bson4jackson now supports Jackson 2.0.
Enda O’Donohoe fixed two bugs regarding the UTF-8 decoder. Thanks for that!
James Roper helped a great deal with the support for Jackson 2.0.
Thanks again for your contributions, James!
bson4jackson 2.0 drops support for older Jackson versions. If
you’re looking for a version supporting the Jackson 1.x branch, then
please download bson4jackson 1.3.0.
For a complete description of bson4jackson (including how to download it)
have a look at my tutorial.
Tags: BSON, bson4jackson, Jackson, Java, JSON
Saturday, December 17, 2011
Version 1.2.0 of bson4jackson
has just been released. bson4jackson adds support for BSON, a binary
representation of JSON, to the Jackson JSON processor.
Thanks to contributions from the community, the latest release of
bson4jackson now includes better support for MongoDB.
Gergő Ertli has fixed the support for the
ObjectId type. Object IDs are used as the primary key for MongoDB documents.
Support for the UUID type has been added by Ed Anuff.
He added a new module which can be registered to Jackson’s ObjectMapper:
ObjectMapper om = new ObjectMapper(new BsonFactory());
om.registerModule(new BsonUuidModule());
Thanks to the contribution by James Roper
the BsonParser class now supports the new HONOR_DOCUMENT_LENGTH
feature which makes the parser honor the first 4 bytes of a document which
usually contain the document’s size. Of course, this only works if
BsonGenerator.Feature.ENABLE_STREAMING has not been enabled during
document generation.
This feature can be useful for reading consecutive documents from an
input stream produced by MongoDB. You can enable it as follows:
BsonFactory fac = new BsonFactory();
fac.enable(BsonParser.Feature.HONOR_DOCUMENT_LENGTH);
BsonParser parser = (BsonParser)fac.createJsonParser(...);
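To illustrate what honoring the document length means, here is a minimal, standalone sketch that splits a byte array holding several consecutive BSON documents using nothing but the length prefix. It relies on the BSON format storing the total document size (including the four prefix bytes) as a little-endian 32-bit integer at the start of each document. This is not how BsonParser is implemented; the class and method names are hypothetical, and a byte array is used instead of an input stream to keep the example short.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Standalone sketch: split consecutive BSON documents by their length prefix. */
final class BsonLengthSketch {
    /** Returns one byte array per document found in the given bytes. */
    static List<byte[]> splitDocuments(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        List<byte[]> docs = new ArrayList<>();
        while (buf.remaining() >= 4) {
            // Peek at the length prefix without consuming it.
            int length = buf.getInt(buf.position());
            // The smallest document has 5 bytes: the prefix plus a terminating zero.
            if (length < 5 || length > buf.remaining()) {
                throw new IllegalArgumentException("Invalid BSON document length: " + length);
            }
            byte[] doc = new byte[length];
            buf.get(doc); // copies the whole document, prefix included
            docs.add(doc);
        }
        if (buf.hasRemaining()) {
            throw new IllegalArgumentException("Trailing bytes after last document");
        }
        return docs;
    }
}
```

Given the 12 bytes of the document {"a": 1} followed by the 5 bytes of an empty document, splitDocuments returns two arrays of length 12 and 5. If the documents had been written in streaming mode, the prefix would not hold the real size and this approach would fail.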
Apart from that, a lot of other minor bugs have been fixed. The library
has been tested with Jackson 1.7 up to 1.9.
For a complete description of bson4jackson (including how to download it)
have a look at my tutorial.
Tags: BSON, bson4jackson, Jackson, Java, JSON, MongoDB