In an era where artificial intelligence (AI) and machine learning (ML) have evolved into near-household terms, Vokal aims to stand out by adding a human touch to its platform. The startup, which began operations in early 2018, is a question-and-answer platform that operates only in vernacular languages. Since its inception, it has built an active user base of two million, and it offers its platform in 11 Indian languages: Assamese, Bangla, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil and Telugu.
As its founders, Aprameya Radhakrishna and Mayank Bidawatka, tell us, much of its growth is attributed to human involvement in both the front and back ends of what it offers, rather than to the technology base alone. Having raised Series A funding of $6.5 million (~Rs 44.8 crore) from investors such as Accel Partners, Kalaari Capital and Shunwei Capital, Radhakrishna and Bidawatka are bullish on the growth potential of Vokal. Unlike many vernacular apps and services that operate in the entertainment space, they want to tap into the rising aspiration among India’s newest internet users, and become a cross between Google, Quora and Twitter in order to reach 100 million active users by end-2020.
News18 caught up with the founders, who had previously worked together on TaxiForSure (which was founded by Radhakrishna), to take a look inside Vokal, an app with tall but grounded ambitions. Here are the excerpts:
On how it began:
Aprameya Radhakrishna (AR): The first product that we built was a communication service, a voice-first WhatsApp of sorts. Then, we realised that it was too early or too late to do that. We then shifted to a one-minute audio status message like Twitter, where we found that most people did not know what to say, so it didn’t get a lot of traction. We then decided to give people the context to create and publish, which is when we thought of topics and questions as a way to make people think and speak.
On where Vokal fits in:
AR: In the past 25 years, most of the content on the internet has been produced in English. If you search on Google in English, you’ll land on extremely relevant search results. But, with the influx of non-English audiences, when someone asks a question in Tamil, Kannada or Hindi, the amount of accurate content being produced on the internet in these languages is too little. That is the missing piece, and that is where Vokal acts as an engine to enable people to both create content and find answers that were difficult to access.
Mayank Bidawatka (MB): We are in a zone where we plan to enable smart people in India to come and create content that the next billion would love to consume. They are aspirational, and they want to know what the smart people have to say. We are hence a bit of a mix between Twitter and Quora in that sense, but not one particular platform as such.
Keeping the platform purely audio-visual:
AR: Transitioning from a keyboard to a touchscreen hasn’t really been a big deal (for us). But those joining the internet now have mostly never used older devices before. Hence, using a keyboard to input information is an alien experience, even more so in a different language. So, we thought of keeping voice or video as the primary mode of input on the device, which feels more conversational.
On the balance of manual and automated curation:
MB: There is a lot of manual curation on the platform right now, and we want to retain that. We have a lot of automation related to a question: our algorithms understand context and semantics well. We then use algos to clean up the question and categorise it into the right type. We also classify users automatically, so if you start answering science questions, we will show you questions related to that topic. This is what lets us make two people meet on Vokal.
AR: When we started, even understanding and tagging the question was manual. Now, that process is 80% automated, and only corrections are manual. We are starting to automate each aspect gradually as we scale up. Exceptions will remain manual, but the Q&A process itself will be automated.
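As an illustration of the user classification described above, the following minimal Python sketch routes open questions to a user based on the topic they have answered most often. It is not Vokal's actual system: the `TopicRouter` class, its method names, and the idea of counting past answers per topic are all assumptions made for the example, and the topic labels are assumed to have been assigned already by an upstream tagging step.

```python
from collections import Counter, defaultdict


class TopicRouter:
    """Hypothetical router: suggests questions matching a user's most-answered topic."""

    def __init__(self):
        # user_id -> Counter of topics the user has answered (assumed data model)
        self._history = defaultdict(Counter)

    def record_answer(self, user_id, topic):
        """Note that a user answered a question tagged with the given topic."""
        self._history[user_id][topic] += 1

    def top_topic(self, user_id):
        """Return the topic the user has answered most often, or None if unknown."""
        counts = self._history.get(user_id)
        if not counts:
            return None
        return counts.most_common(1)[0][0]

    def questions_for(self, user_id, open_questions):
        """Filter (question_text, topic) pairs down to the user's top topic."""
        topic = self.top_topic(user_id)
        return [text for text, q_topic in open_questions if q_topic == topic]


router = TopicRouter()
router.record_answer("user_1", "science")
router.record_answer("user_1", "science")
router.record_answer("user_1", "history")

open_questions = [
    ("Why is the sky blue?", "science"),
    ("Who built the Taj Mahal?", "history"),
    ("How do vaccines work?", "science"),
]
suggested = router.questions_for("user_1", open_questions)
```

In this sketch a user with no answer history gets no suggestions, which mirrors the manual onboarding step the founders describe for new contributors.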
Onboarding the 'experts':
AR: It is important to get the right person to speak on the topic. So that’s the first set of experts, who are chosen based on the knowledge that they have. The second are people who come on the platform and start answering. We do a manual check on the quality of their answers, and then allow them to answer more, going forward. This is a dual process that works both ways.
MB: We invite folks who are experts in specific subject matters. We then give them relevant questions to answer. Sometimes, a lot of common people on the platform also talk a lot of sense, and they are really good. So we hear a lot of their answers, and push them accordingly.
Staying away from English:
AR: There’s already so much available in the English language. For the English audience, when they search, the primary interaction is not yet audio-video. Whereas, it’s very clear for local languages that audio and video are preferred over text. There is also a dearth of good quality content, which makes answering questions really valuable for the local language audience. Users also feel comfortable interacting with others on our platform. One should be able to clearly express themselves, without feeling inhibited.
On regulating content quality:
AR: Vokal is a platform to express information. We are not here to control anyone’s views on the platform. First, we ensure that there is a quality check right at the start, so that people are producing good-quality content. Second, we also look at reports made by our users on the answers given by fellow users. So, we handle these issues by a rule of exception, rather than regulating everyone and everything that happens on the platform.
We use a standard speech-to-text mechanism, which makes all the available content searchable. Hence, the initial quality check is done using a traditional search-and-sort machine learning model. Then, every answer that comes in gets sampled among the pool of reactions. If an answer has a high percentage of likes, it gets taken up and ranked higher in the stream. That is how we monitor it.
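The like-based ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Vokal's implementation: the `Answer` fields, the `like_percentage` and `rank_answers` helpers, and the choice to treat an answer with no reactions as 0% are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    answer_id: str
    likes: int       # number of "like" reactions (assumed field)
    reactions: int   # total reactions sampled for this answer (assumed field)


def like_percentage(answer):
    """Share of sampled reactions that are likes, as a percentage."""
    if answer.reactions == 0:
        return 0.0  # assumption: answers with no reactions rank last
    return 100.0 * answer.likes / answer.reactions


def rank_answers(answers):
    """Order answers so those with a higher like percentage appear first."""
    return sorted(answers, key=like_percentage, reverse=True)


stream = [
    Answer("b", likes=3, reactions=10),
    Answer("a", likes=8, reactions=10),
    Answer("c", likes=0, reactions=0),
]
ranked_ids = [a.answer_id for a in rank_answers(stream)]
```

A raw percentage favours answers with very few reactions (one like out of one reaction scores 100%), so a production system would likely also weigh the sample size; the article does not say how Vokal handles this.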
On scaling up:
MB: Our product is honestly targeted at the next 100 million, rather than all of the next billion per se. As time progresses, another bunch will grow in aspiration and maturity. These are the people who will see their peers make progress, and become aspirational in the process. It is a small segment we are targeting now, but this is about to become much larger in the immediate future.
We are looking at scaling up to 100 million users by late 2020. Partnerships will be an important part of that — we will tie up with various OEMs, businesses and operators such as Reliance Jio, which has contributed a lot to building the ecosystem. Our product is also very relevant to their users — there is a lot of stuff on entertainment, but nothing that is a bit more serious.
On monetisation plans:
AR: We might monetise in a format that is closer to Google’s — say, if there is a question such as what is the best smartphone to purchase under Rs 10,000, there are a lot of OEMs in the country that would want to own that question as the first sponsored answer. If we, for instance, drive a million views to that answer every day, that is a straightforward way to monetise right there. But we are not looking at an ad-supported revenue model right away.
MB: Our business and operation model is actually quite difficult to pull off. It is completely ground-up, and very slow when it starts off because you are literally convincing one guy at a time, which makes building engagement in the network quite difficult. Our traffic is also pretty much mobile-only: over 95% of users come from there.