Google has been quietly improving Bard and adding new features every few weeks, bringing its capabilities up to par with ChatGPT. Now, the company has added the ability to upload images to Bard for a much richer experience beyond text. Make no mistake, Google Bard is still a text-only large language model. However, the search giant has integrated Google Lens, reverse image search, and some VQA (Visual Question Answering) techniques to make Bard feel like a multimodal model. Nevertheless, Bard's current vision capability is quite surprising, and we have tested it below to learn what it can do. On that note, let's check out some cool examples of image uploads in Google Bard.
1. Extract Text from Images
The best application of Bard's image-handling capability is text extraction. You can now upload an image by clicking on the (+) button, and Bard quickly grabs the text from it. Google Bard performs OCR automatically and does an accurate job. That said, despite Bard's long list of supported languages, the OCR functionality currently works only for English. I tried multiple international and regional languages, but it failed to grab text from scanned images. Nevertheless, for quick text extraction from images, Bard can be very helpful.
2. Extract Tables from Images
We all struggle when we have to extract tables from scanned images or documents. However, Google Bard can effortlessly extract tables with the formatting intact. In fact, you can export the table to Google Sheets as well and do further editing or data crunching. How cool is that? Having said that, Bard currently hallucinates a lot, and in some cases it fills cells with the wrong data, so make sure to verify the table against the original image before exporting it.
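If you copy an extracted table out as text instead of exporting it directly, a short script can catch structural slips before the data reaches a spreadsheet. The following is only a minimal sketch: it assumes the table was pasted as pipe-delimited text (an assumption, the actual format may differ), and the function names are made up for illustration. It only flags rows with a missing or extra cell. It cannot tell whether a value is wrong, so you still need to compare the cells against the original image.

```python
import csv
import io


def parse_pipe_table(text):
    """Parse pipe-delimited table text into a list of rows (lists of cell strings)."""
    rows = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # ignore anything that is not a table row
        # Drop the leading pipe and, if present, the single trailing pipe.
        inner = line[1:-1] if line.endswith("|") else line[1:]
        cells = [cell.strip() for cell in inner.split("|")]
        # Skip separator rows such as |---|:---:|
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    return rows


def ragged_rows(rows):
    """Return indexes of rows whose cell count differs from the header row."""
    width = len(rows[0]) if rows else 0
    return [i for i, row in enumerate(rows) if len(row) != width]


def to_csv(rows):
    """Serialize rows as CSV text that a spreadsheet can import."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


# Hypothetical pasted table; the last row is missing a cell on purpose.
sample = """
| Item   | Qty |
|--------|-----|
| Apples | 3   |
| Pears  |
"""

rows = parse_pipe_table(sample)
print(ragged_rows(rows))  # indexes of rows to re-check by hand
```

Rows reported by `ragged_rows` are worth re-checking first, since a dropped or merged cell usually shifts every value after it.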
3. Generate Code for Websites/Apps Using Mockups
To showcase GPT-4's multimodal capability, in March 2023, OpenAI demonstrated how its model understood a scribbled mockup on a piece of paper and quickly turned it into a working website. While the multimodal feature is yet to come to GPT-4, Google Bard is already able to generate code that matches a mockup. Keep in mind that Bard is not a multimodal model but uses image segmentation via Google Lens to understand the image. Nonetheless, Bard surprised us with its results.
I uploaded a screenshot of the Facebook landing page, and it quickly generated code in HTML and CSS that looked somewhat similar. I also uploaded an image of a simple website that I drew on paper, and Google Bard did a good enough job of recreating it. Further, you can use similar methods for recreating UIs for smartphone apps and other websites as well.
4. Google Bard Can Explain Images
Google Bard is good at explaining images and summarizing what is going on in them. You can upload obscure images, and it can produce reliable information quickly. I uploaded a low-quality image of a biological mechanism, and it correctly identified it as Cell Mitosis. It further explained the process step by step.
In another example, I uploaded a chart, and it correctly understood the image and explained the data. It even created a table of the data points so that I could work on it in Google Sheets. Particularly for students, Bard can be helpful in understanding concepts in science and other topics. You can simply upload an image and ask Bard about it.
5. Get Nutritional Information from Images
Using Bard's image-handling capability, you can get the nutritional values of food. Simply upload an image of the food on your plate, and it will estimate the calories within seconds. This can be immensely helpful for people who are on a regulated diet.
In my testing, it couldn't gauge the portion size, but it gave per-serving figures so that you could calculate the total calorie intake yourself. It seems Google is using image segmentation to categorize food items and come up with nutritional information.
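Since the portion sizes have to come from you, the final sum is simple arithmetic: multiply each portion's weight by its calories per 100 g and add the results. Here is a minimal sketch of that calculation. The function name and all the numbers are placeholders chosen for illustration, not figures from Bard or from any nutrition database.

```python
def total_calories(items):
    """Sum calories for (grams, kcal_per_100g) pairs."""
    return sum(grams * kcal_per_100g / 100 for grams, kcal_per_100g in items)


# Hypothetical plate: (portion weight in grams, kcal per 100 g).
# These values are placeholders, not real nutritional data.
plate = [
    (150, 130),
    (100, 165),
    (80, 35),
]

print(total_calories(plate))  # 195.0 + 165.0 + 28.0 = 388.0
```

Swap in the per-100 g figures Bard gives you and the portion weights you measure yourself.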
6. Improvise Food Recipes
Another excellent use case is to upload an image of raw ingredients and ask Google Bard to come up with recipes. You can also add images of the food items in your refrigerator, and it will effortlessly create personalized recipes for you. Furthermore, you can ask Bard for particular cuisines from various parts of the world. And if you are on a diet, you can ask Google Bard to create fat-free, low-calorie recipes that are still filling.
7. Solve Mathematical Questions
You can use Google Bard to solve mathematical questions as well. Upload an image of your maths problem, and Bard will try to solve it for you. In my testing, Bard's approach was right, but because it misread the notation, it arrived at wrong answers. I think it will require an update to its vision system to make Bard more suitable for handling mathematical notation and questions.
8. Explain Memes and Jokes
Google Bard can also explain memes and jokes. You can upload images of funny memes and cartoons and ask Bard what is funny about them, and it will provide its own interpretation. I uploaded the same image that OpenAI demonstrated during the GPT-4 unveiling, and Bard rightly understood the hilarious absurdity behind the image.
In another instance, I uploaded a cartoon from The New Yorker to Google Bard and asked it to explain the joke. However, this time, it simply described the scene and couldn't tell why the image was funny. It entirely missed the email phrase that is commonly used in workplaces. I suggest you try Google Bard yourself and check whether it's intelligent enough to understand wit and humor.
9. Translate Equations to LaTeX
It’s no secret that many people find it hard to write in LaTeX and prefer to use word processors. However, for scientific research papers and academic writing, LaTeX is required for adding complex equations and high-quality typesetting. In such a scenario, Google Bard can be helpful. You can add images of equations, and Bard can translate them to LaTeX code. That’s amazing, right? So, go ahead and translate the equations to LaTeX code in no time.
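To give a sense of what such a translation looks like, here is a hand-written illustration (not actual Bard output) of the LaTeX source for a familiar equation, the quadratic formula. It assumes a standard LaTeX document with no extra packages.

```latex
% Quadratic formula, typeset as a displayed equation
\[
  x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
\]
```

When you get LaTeX back from Bard, compile it and compare the rendered result with the original image, since a misread exponent or subscript is easy to miss in the source.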
10. Upload Medical Reports and Ask Questions
Finally, you can upload images or scans of your medical reports to Google Bard. You can then ask medical questions based on them. Some physicians on Twitter have shown that Bard is quite decent at differential diagnosis. It can also help users understand their health and make sense of medical reports.
That said, do keep in mind that Google Bard is running on a general-purpose LLM called PaLM 2. The search giant has developed a separate medical-domain model, Med-PaLM 2, which is quite accurate and advanced, but it's not available to general users yet. So I recommend that users avoid any kind of self-diagnosis using Bard. It's strongly recommended to consult a doctor. And finally, if you upload your personal medical reports to Bard, make sure to delete your Bard chats to protect your privacy.