- Install R
- Make sure that Rscript runs from the command line
- Install php-cli (PHP command-line interface), version > 5
- Make sure that php-curl is installed
- Make sure that shell_exec works in php-cli
- Obtain a TagMe API key
- Download the SBounTI package and extract it in an empty directory
- Edit cfg/config.php as needed (for example, the base URLs of the resources that will be produced and the TagMe API key)
- Obtain a microblog post dataset of about 5 thousand posts, either
  - as a text file with one short text per line
  - or as a raw file retrieved from the Twitter streaming API
- Issue the command:
  - ./sbounti <filename> "<dataset_name>" "<start_date>" "<end_date>" for the text file
  - ./sbounti <filename> "<dataset_name>" for the raw Twitter streaming API file
- The contents of the produced OWL file are written to STDOUT, so you may want to redirect the output to a file by appending "> filename.owl" to the command.
- If you have questions, please contact Ahmet Yildirim.
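
Since the steps above ask for a dataset of about 5 thousand posts with one short text per line, it can help to count the posts before running the command. The following is a minimal sketch only; count_posts is a hypothetical helper name and is not part of the SBounTI package.

```shell
# count_posts is a hypothetical helper, not part of SBounTI.
# It prints the number of non-blank lines in the given text file,
# i.e. the number of posts when each line holds one short text.
count_posts() {
    grep -c '[^[:space:]]' "$1"
}

# Demo on a tiny throwaway file (three lines, one of them blank).
tmp=$(mktemp)
printf 'first post\n\nsecond post\n' > "$tmp"
count_posts "$tmp"   # prints 2
rm -f "$tmp"
```

With a real dataset, the count should come out near 5000 before the file is passed to ./sbounti as described in the list.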
Source code of the Microblog Semantic Topic Identification prototype is published
Our previous study extracts human-readable topics from a given set of microblog posts. Building on the idea of identifying the topics of a crowd of microblog users, we have recently come up with a way to represent microblog topics semantically for machine consumption.
The source code of the prototype is published.
To install the topic identification approach on a Linux machine, follow these steps.
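
The prerequisites named in the installation steps (Rscript, php-cli, php-curl, and a working shell_exec) can be checked with a short shell sketch like the one below. It is an illustration under assumptions, not part of the SBounTI package: check_cmd and check_php are hypothetical helper names, and the version test reads "version > 5" literally via PHP's version_compare.

```shell
# check_cmd and check_php are hypothetical helpers, not part of SBounTI.

# Print "ok: <name>" if the command is on PATH, otherwise "missing: <name>".
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

# $1 = label, $2 = PHP expression expected to evaluate to true in php-cli.
check_php() {
    if command -v php >/dev/null 2>&1 &&
       php -r "exit(($2) ? 0 : 1);" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

check_cmd Rscript
check_cmd php
check_php "php version > 5" "version_compare(PHP_VERSION, '5', '>')"
check_php "php-curl" "extension_loaded('curl')"
# Actually call shell_exec, since a disabled function may still be defined.
check_php "shell_exec" "function_exists('shell_exec') && trim((string) shell_exec('echo ok')) === 'ok'"
```

Each line of output names one prerequisite; anything reported as missing should be installed or enabled before running ./sbounti.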