We are on the verge of a computer vision revolution.
It’s reminiscent of the early days of the internet: the promise of technology is clear, but society hasn’t yet seen widespread adoption of it. When we do, computer vision will touch every aspect of our lives.
Consider our daily commutes. Car manufacturers are promising a future in which autonomous vehicles remove the cognitive and temporal burdens that come with driving. That future depends on computer vision.
Thanks to computer-vision-powered “smart carts”, our everyday shopping experience is evolving rapidly. No more waiting in lengthy queues at the grocery store. Shoppers can scan, register, and pay for their items without having to visit the checkout counter.
And what about our health? Recent advancements in computer vision have increased the quality and capabilities of medical imaging so that computers can help doctors spot and diagnose abnormalities such as tumours and stroke indicators.
While it’s hard to predict all of the ways that computer vision will affect our day-to-day lives, it’s even harder to predict the ways in which computer vision will help humans tackle some of the world's most pressing problems. As computing power decreases in cost, machine learning models increase in accuracy, and data becomes more plentiful, organisations are thinking more and more about how best to use computer vision to address large-scale challenges.
As an example, the recent proliferation of satellite imaging has created enormous opportunities to develop computer-vision-based approaches for responding to global challenges, such as coping with natural disasters and changing weather patterns.
Every day, space technology companies record thousands of square kilometres of satellite imagery. When a natural disaster strikes, computer vision models can analyse this data and assess the damage, providing real-time intelligence about what’s happening on the ground from the moment the disaster begins.
Quickly determining the extent of the damage can help international development agencies provide the appropriate amount of emergency disaster relief. Computer vision can help these agencies better understand the resource scarcity caused by a disaster and the types of aid most urgently needed to prevent human suffering. By automating image identification, computer vision models can likewise identify groups of people in satellite imagery, thereby helping search and rescue teams quickly locate people in crisis situations. The alternative, having humans manually sift through aerial images to identify people in distress, is a time-consuming process that doesn’t align with the urgent action required to save lives after a natural disaster.
Using computer vision to analyse the extent of the infrastructure damage in the event of a natural disaster also provides insight into the total amount of money needed for reconstruction, helping insurance companies efficiently estimate the cost of repair. As a result, insurance claims can be made and paid in a more timely manner than when they rely on a surveyor providing an on-foot assessment of the damage.
When it comes to long-term strategies for coping with natural disasters and weather patterns, computer vision models can help scientists predict the changes that will occur in a particular area, such as increased flooding, as a result of climate change. With this knowledge, governments can make assessments about whether to prohibit people from building in and inhabiting areas where future disasters are likely.
Computer vision could be a game changer for creating new approaches to tackling large-scale challenges such as providing disaster relief. Unfortunately, the AI industry’s reliance on manually labelling data is hindering the technology’s progress. To make the most of this technology, companies need to curate high-quality datasets suited to the model’s use case and label them appropriately. Doing so requires moving away from outdated data management practices.
Data, Data, Everywhere, and Not a Bit to Read
The world has a surplus of data, and it’s increasing all the time. Ninety percent of the world’s data has been generated since 2016. This increase can create incredible opportunities for computer vision applications; however, most of the world’s data is unlabelled, so computer vision models can’t read it. These models need to train on well-curated and appropriately labelled training data. If not properly trained, well-designed models can become useless.
These models also need to train on vast amounts of labelled data so that they can become incredibly confident in their predictions. (Remember these models will run self-driving cars and inform disaster relief strategies.) Acquiring large quantities of high-quality training data is the greatest obstacle for the advancement of computer vision. Unfortunately, most AI companies still rely on the practice of manual data labelling. Manually labelling data isn’t sufficient, scalable, or sustainable; furthermore, the escalation of data generation means the number of human labellers available will soon be outpaced by the amount of data that needs to be labelled.
The Pitfalls of Manual Labelling
Data labelling is a slow, tedious process that’s prone to human error. With a purely manual approach, annotating minutes of video and image data takes many hours.
Many companies provide data labelling services in which they outsource the data to human labellers. However, such outsourcing means losing the input of subject-matter experts into the labelling process, which could result in low-quality training data and compromise the accuracy of the computer vision model. Also, because these jobs often go to people living in developing economies, outsourcing data labelling isn’t a viable option for companies operating in any domain where security and data privacy are important, such as healthcare, education, and government.
Some data labelling services offer model-assisted labelling, but accessing it requires developers to jump through hoops: they must run their production models on the data they want to label before they can create and apply labels, a time-consuming operational burden.
Because of the issues associated with data labelling services, many teams build internal tools and use their in-house workforce to manually label their data. However, building these tools in-house often leads to cumbersome data infrastructure and annotation tools that are expensive to maintain and challenging to scale.
So what’s to be done?
For starters, we’ve got to acknowledge that the current manual approach to data labelling isn’t working.
Breaking Away From Manual Labelling
Data labelling cannot remain a manual process if machine learning in general, and computer vision in particular, are to become ubiquitous technologies.
Crowdsourcing, outsourcing, in-house labelling – none of these stop-gap approaches will clear the data bottleneck and unlock the power of AI for solving large-scale challenges. Their shortcoming is that they try to improve upon an inherently flawed system of manual labelling.
They are effectively better wrenches when what’s needed is a power drill.
AI’s looming data needs require new tools, ones capable of scaling. That’s why we designed Encord.
Encord’s computer-vision-first platform uses a unique technology called micro-models to automate data labelling. Our platform enables companies to break away from a system of data annotation dependent on manual labour. It automates labelling by running micro-models that start from only a few pieces of hand-annotated data; the micro-model then trains itself to label the rest.
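To make the idea concrete, here is a minimal sketch of the micro-model pattern in plain Python. It is purely illustrative and assumes a toy nearest-neighbour classifier rather than Encord’s actual models: a handful of hand-annotated examples seed the model, which then proposes labels for the remaining data.

```python
# Illustrative sketch only (not Encord's API): a tiny "micro-model"
# seeded with a few hand-annotated examples that labels the rest.

def nearest_label(sample, labelled):
    """Return the label of the hand-annotated example closest to `sample`."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(labelled, key=lambda ex: dist(sample, ex[0]))[1]

def auto_label(unlabelled, labelled):
    """Propose a label for every unlabelled sample using the seed set."""
    return [(s, nearest_label(s, labelled)) for s in unlabelled]

# A few hand-annotated feature vectors (toy 2-D stand-ins for image features).
seed = [((0.1, 0.2), "damaged"), ((0.9, 0.8), "intact")]

# The micro-model proposes labels for the rest of the dataset automatically.
proposed = auto_label([(0.2, 0.1), (0.85, 0.9)], seed)
```

In practice the classifier would be a small neural network retrained as corrected labels accumulate, but the workflow, seed with a few annotations and let the model label the rest, is the same.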
Encord also embeds its data annotation tools into the platform, so users can access model-assisted labelling without jumping through any hoops or placing any operational burdens on developers. In addition, companies retain 100 percent control of their data, making Encord an ideal solution for companies with data privacy and security considerations.
Flexible ontology, defining the set of features you are looking for in the data and mapping the relationships between those features, is necessary for using computer vision to solve large-scale problems. Encord enables flexible label ontology, which also allows users to target each micro-model to individual features in the ontology. With Encord, users can define multiple ontologies and then build a separate micro-model to label each different feature in the ontology.
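One way to picture an ontology is as a nested tree of features, where each node can become the target of its own micro-model. The sketch below is hypothetical (the feature names and structure are illustrative, not Encord’s schema):

```python
# A hypothetical ontology: nested features for disaster-damage assessment.
ontology = {
    "infrastructure": {
        "house": {"damaged": None, "intact": None},
        "road": {"flooded": None, "clear": None},
    },
}

def features(node, prefix=""):
    """Flatten the ontology into dotted feature paths, one per micro-model target."""
    paths = []
    for name, child in node.items():
        path = f"{prefix}.{name}" if prefix else name
        paths.append(path)
        if isinstance(child, dict):
            paths.extend(features(child, path))
    return paths

targets = features(ontology)
```

Each flattened path (for example, `infrastructure.house.damaged`) can then be assigned its own micro-model, which is what makes the ontology both flexible and granular.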
Supporting flexible ontology results in more advanced computer vision capabilities because it allows models to express greater complexity. To design complex computer vision models, users need to be able to construct a rich ontology. For instance, to determine the amount of damage caused by a natural disaster, a computer vision model needs to identify the type of infrastructure and then determine whether that infrastructure has suffered damage. To count the houses damaged by the disaster, users would build an “infrastructure damage” model alongside a “house detector” targeted at that specific feature.
By combining many micro-models together, data engineers can ask nested questions and obtain granular ontologies that increase the usefulness of models for real-world use cases. For example, after building a micro-model to determine whether a natural disaster damaged a building, a data engineer could construct a micro-model to determine whether street flooding occurred nearby. By linking these two micro-models together, the engineer could gain a better sense of the overall infrastructure damage for that particular area.
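The chaining described above can be sketched as two stand-in model functions whose outputs feed a simple combining rule. Everything here is illustrative, the model functions are placeholders for real micro-model predictions:

```python
# Hypothetical sketch of chaining two micro-model outputs into one
# area-level assessment (placeholder predictions, not real models).

def building_damage_model(tile):
    # Stand-in for a micro-model prediction on one image tile.
    return tile.get("building_damage", False)

def street_flooding_model(tile):
    # Stand-in for a second micro-model run on the same tile.
    return tile.get("flooding", False)

def area_severity(tiles):
    """Nested question: what fraction of tiles show both damage and flooding?"""
    both = sum(1 for t in tiles
               if building_damage_model(t) and street_flooding_model(t))
    return both / len(tiles)

tiles = [
    {"building_damage": True, "flooding": True},
    {"building_damage": True, "flooding": False},
    {"building_damage": False, "flooding": False},
    {"building_damage": False, "flooding": True},
]
severity = area_severity(tiles)
```

The combining rule here is deliberately trivial; the point is that each micro-model answers one narrow question, and linking their answers yields the richer, area-level view described above.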
With well-trained models and data-centric AI, companies can transform the promise of computer vision into reality. They can build smart cities, streamline manufacturing, and develop cancer detecting devices. They can build models that monitor climate change, predict natural disasters, and help increase food security.
But to achieve that reality, companies must break with their unsustainable and unscalable labelling practices and embrace new tools designed for the future of AI.