Powering the future through the creation of high-quality, large-scale, multilingual learning data for AI/machine learning

Contact us

What our clients are saying

"Baobab creates high quality image annotation data sets for us according to various different requirements. Furthermore, since the annotators are individually managed, we also entrust the company with the annotation of highly sensitive data."

"I have asked Baobab to create data for my research many times, and I really appreciate their willingness and flexibility in responding even to slightly unusual requests. I thoroughly recommend them."

Our services


BAOBAB is a website for creating translations and linguistic data through group collaboration and machine learning, with the aim of broadening communication and services all over the world.

Learning Data for Machine Translation

Our company began as a service that could offer the extremely large volumes of textual learning data needed for machine translation faster and at a more reasonable price than anywhere else.

Image annotation and tagging / voice data collection

  • Image annotation and tagging
  • Image captioning
  • Voice transcription

We have created a special in-house tool for image annotation, ensuring speedy and accurate results.


Using Moringa, a mobile app developed by BAOBAB, staff all around the world can collect and tag images, and collect multilingual speech utterances/sounds.

  • Moringa-i, an image collection and tagging tool
  • Moringa-v, a voice data collection and tagging tool

Creating bilingual scenarios in multiple languages

  • Data for dialogue scenarios, read aloud by native speakers.
  • Simulated conversations between two speakers, speaking freely in a predetermined setting.

We create transcribed voice data and written transcriptions of the above.
Click here for sample dialogue scenarios

Developing machine translation engines specialised for particular areas

We develop machine translation engines specialised for particular areas, and provide them as an API. We undertake everything from the creation of learning data, to the development of machine translation engines, and human-powered evaluation of the resulting translations.

  • A machine translation engine that specialises in recipes (Japanese ⇄ English)
Yoko's Yummy Recipes (iOS / Android)


Baobab celebrated its 10th year in business.


NHK, Japan's national broadcasting organization, featured Baobab Inc. on its educational program "Tokorosan, Good Heavens!".


We were selected for Google for Startups Accelerator.


Our CEO Miori Sagara will speak at DLLAB Engineer Days.


Sponsored and ran a booth at CVPR2019.


Our CEO Miori Sagara spoke at a panel session at Microsoft's de:code2019 conference.


We will sponsor and run a booth at CVPR2019.


Chiba Institute of Technology, AIST and NEDO together release the world's largest video caption data set.


We will run a booth at CEATEC JAPAN 2018.


Sponsored the 32nd Annual Conference of the Japanese Society for Artificial Intelligence (JSAI).


We have established our U.S. corporation, Baobab America Inc.


We have started a new service, "Region definition and feature point annotation for images".


Sponsored and ran a booth at the 24th Annual Conference of Natural Language Processing (NLP2018).


The Chinese (Simplified) version of our website has launched.


Sponsored the 31st Annual Conference of the Japanese Society for Artificial Intelligence (JSAI).


We are pleased to announce the appointment of Dr Graham Neubig of the Carnegie Mellon University Language Technologies Institute as our adviser, effective April 1st, 2017.


We have moved to a new office.


We released Moringa, an image/voice data collection and tagging tool.


Sponsoring the 26th International Conference on Computational Linguistics (Coling 2016).


Sponsored the 22nd Annual Conference of Natural Language Processing (NLP2016).


Baobab CEO Miori Sagara was appointed as a delegate of the Association for Natural Language Processing.


Sponsored the 53rd Annual Meeting of the Association for Computational Linguistics (ACL2015).


Baobab CEO Miori Sagara participated in a workshop at the National Institute of Informatics as a guest speaker.


Began an Image Gathering and Annotation Service


Sponsored the 25th International Conference on Computational Linguistics (Coling 2014)


Participated in "From Minato-ku! Latest Global Business Seminar" as a guest speaker.


Assisted with the 25th International Conference on Computational Linguistics (Coling 2014)


Released an updated version of “Yoko’s Yummy Recipes”, an iPhone/Android application designed specifically for translating recipes.


Worked as a guest lecturer at Meiji University.


Assisted with the 19th Annual Conference of Natural Language Processing (NLP2013)