Developer workshops (hackathons)

Venue

 * KB, The Hague, NL
 * Hotels

Participation
Max. 20 participants. Participation in the event is FREE OF CHARGE!

Register here

Date
1.5 days, 19-20 September 2013

Schedule
 * Day 1 (19 September 2013)
   * 12:00 - 12:30  Registration, coffee, meet & greet
   * 12:30 - 13:15  Lunch
   * 13:15 - 13:30  Presentation: Background & Topics
   * 13:30 - 14:00  Introduction of participants, forming of groups
   * 14:00 - 17:00  Hacking time
   * 17:00 - 17:30  Status round-up
   * 19:00 - 21:00  Option for a social dinner (self-paid) at Taj Mahal Den Haag
 * Day 2 (20 September 2013)
   * 09:30 - 10:00  Coffee
   * 10:00 - 12:30  Hacking time / in parallel: WP2 meeting
   * 12:30 - 13:30  Lunch
   * 13:30 - 16:00  Hacking time / in parallel: WP2 meeting
   * 16:00 - 17:00  Presentation of results

Results
Please visit the blog.

Venue

 * UA, Alicante, ES
 * Hotels

Participation
Max. 20 participants. Participation in the event is FREE OF CHARGE!

Register here

Date
1.5 days, 10-11 April 2014

Our Hackathon is featured in HackathonHero.com's List of Hackathons.

Schedule
 * Day 1 (10 April 2014)
   * 12:00 - 12:30  Registration, coffee, meet & greet
   * 12:30 - 13:15  Lunch
   * 13:15 - 13:30  Presentation: Background & Topics
   * 13:30 - 14:00  Introduction of participants, forming of groups
   * 14:00 - 17:00  Hacking time
   * 17:00 - 17:30  Status round-up
   * 19:00 - 21:00  Option for a social dinner (self-paid)
 * Day 2 (11 April 2014)
   * 09:30 - 10:00  Coffee
   * 10:00 - 12:30  Hacking time
   * 12:30 - 13:30  Lunch
   * 13:30 - 16:00  Hacking time
   * 16:00 - 17:00  Presentation of results

Results
Please visit the blog.

Topics

 * OCR Training
   * Training Tesseract
   * Training Gamera
 * Evaluation
   * Fine-tuning open-source Java tool for OCR Evaluation
 * Language Technology
   * Use cases and interfaces for historical dictionaries in OCR
   * OCR & NLP
 * Workflows & Interoperability
   * Scientific workflows: Taverna, Meandre
   * Taverna2 Server: interfaces, clients
   * Local tool integration (tool services) in Taverna
   * Interoperability with web services from CLARIN-DE/NL, e.g. Weblicht
   * Issues from interoperability-framework
   * Training in using the IIF, Taverna and all related components
 * Other
   * XSLT stylesheets for format conversion, e.g. hOCR, PAGE, FRXML (see also format-converter)
   * Debian package generation – using SCAPE toolwrapper, instructions
   * OCR & DIA with PLAiR tools
   * Improving documentation, creating tutorials and training materials
   * Test, experiment, break things and have fun doing so!

Tools
Many tools for image processing, OCR, etc. from IMPACT and other projects will be made available for the duration of the event as web services and as workflow descriptions for the Taverna workflow management system.

Other relevant tools:
 * IMPACT Centre of Competence Demonstrator Platform
 * A very comprehensive list of tools for text digitisation from Succeed
 * Tools for (scalable) digital preservation from SCAPE - also available as workflows
 * Inventory of tools from CLARIN
 * Digital research tools from Bamboo DiRT

Data
The datasets we will work with typically comprise a combination of (high-res) images (TIF, JP2), ground truth transcriptions incl. layout coordinates (PAGE), and descriptive metadata (XML, CSV).
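
As a rough illustration of how such a dataset might be prepared for a hacking session, the sketch below pairs each image with its PAGE ground-truth file. It assumes a hypothetical layout in which images and ground truth live in two directories and share a file name stem (e.g. `0001.tif` and `0001.xml`); the actual datasets may be organised differently, and the function name is ours.

```python
from pathlib import Path

# Image formats mentioned in the dataset description (TIF, JP2).
IMAGE_SUFFIXES = {".tif", ".tiff", ".jp2"}


def pair_images_with_ground_truth(image_dir, gt_dir):
    """Match images to ground-truth XML files that share the same file stem.

    Assumption (not taken from the datasets themselves): an image
    `0001.tif` has its PAGE ground truth in `0001.xml`.
    Returns (pairs, unmatched), where pairs is a list of
    (image_path, ground_truth_path) and unmatched lists images
    without a ground-truth file.
    """
    gt_by_stem = {p.stem: p for p in Path(gt_dir).glob("*.xml")}
    pairs, unmatched = [], []
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip metadata files and anything that is not an image
        gt = gt_by_stem.get(image.stem)
        if gt is None:
            unmatched.append(image)
        else:
            pairs.append((image, gt))
    return pairs, unmatched
```

Calling `pair_images_with_ground_truth("images", "page")` on such a layout returns the matched pairs plus the images that still lack ground truth, which is a quick sanity check before OCR training or evaluation.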

The following datasets are freely available:
 * Datasets from IMPACT:
   * Polish digital libraries dataset, licensed CC-BY
   * Biodiversity heritage library dataset, licensed CC-BY

During the event we will also have access to more comprehensive datasets such as the full IMPACT dataset as well as a newspapers dataset from Europeana Newspapers.


 * Other known datasets:
   * iDigBio OCR datasets