Technology

Artificial intelligence tool turns audio into video

Barack Obama. - Photo by Pete Prodoehl

A new artificial intelligence tool can create realistic videos from audio files alone.

The technology, developed at the University of Washington, is based on new algorithms that turn audio clips into realistic, lip-synced videos of the person speaking. The algorithms learn from existing videos of that person found on the Internet or elsewhere.

How was this tech created?

Creating the tech involved first training a neural network to view videos of an individual and then to translate different audio sounds into basic mouth shapes. Then, researchers used a new mouth synthesis technique to realistically superimpose mouth shapes and textures onto an existing reference video of a given person.

To test the technology, the research group generated a realistic video of Barack Obama discussing subjects as diverse as terrorism, fatherhood and employment. The video was created from audio clips alone, combined with separate existing video footage of the former president. It overcomes a major problem with adding audio to video, where the mouth of the speaker appears unrealistic.

The video below shows the outcome of the study:

Discussing the outcome, lead researcher Professor Ira Kemelmacher-Shlizerman enthused: “These type of results have never been shown before.”

The white paper based on the research is called “Synthesizing Obama: Learning Lip Sync from Audio”.

Business

5 tips for brainstorming with ChatGPT

How to avoid inaccuracy and leverage the full creative rein of ChatGPT

ChatGPT reached a staggering 100 million users by January 2023. Given that it has one of the fastest-growing user bases of any software, we imagine even higher numbers this year.

It’s not hard to see why. 

Amazon sellers use it to optimize product listings that bring in more sales. Programmers use it to write code. Writers use it to get their creative juices flowing. 

And occasionally, a lawyer might use it to prepare a court filing, only to fail miserably when the judge notices numerous fake cases and citations. 

Which brings us to the fact that ChatGPT was never infallible. It’s best used as a brainstorming tool with a skeptical lens on every output. 

Here are five tips for how businesses can avoid inaccuracy and leverage the full creative rein of generative AI when brainstorming.

  1. Use it as a base

Hootsuite’s marketing VP Billy Jones talked about using ChatGPT as a jumping-off point for his marketing strategy. He shares an example of how he used it to create audience personas for his advertising tactics. 

Would he ask ChatGPT to create audience personas for Hootsuite’s products? Nope, that would present too many gaps where the platform could plug in false assumptions. Instead, Jones asks for demographic data on social media managers in the US — a request easy enough for ChatGPT to gather data on. From there he pairs the output with his own research to create audience personas. 

  2. Ask open-ended questions

You don’t need ChatGPT to tell you yes or no — even if you learn something new, that doesn’t really get your creative juices flowing. Consider the difference: 

  • Does history repeat itself? 
  • What are some examples of history repeating itself in politics in the last decade?

Open-ended questions give you much more opportunity to get inspired and ask questions you may not have thought of. 

  3. Edit your questions as you go

ChatGPT has a wealth of data at its virtual fingertips to examine and interpret before spitting out an answer. That means you can narrow down the data for a more focused response by using multiple prompts that further tweak its answers.

For example, you might ask ChatGPT about book recommendations for your book club. Once you get an answer, you could narrow it down by adding another requirement, like specific years of release, topic categories, or mentions by reputable reviewers. Adding context to what you’re looking for will give more nuanced answers.

  4. Gain inspiration from past success

Have an idea you’re unsure about? Ask ChatGPT about successes with a particular strategy or within a particular industry. 

The platform can scour through endless news releases, reports, statistics, and content to find you relatable cases all over the world. Adding the word “adapt” into a prompt can help utilize strategies that have worked in the past and apply them to your question. 

As an example, the prompt, “Adapt sales techniques to effectively navigate virtual selling environments,” can generate new solutions by pulling from how old problems were solved. 

  5. Trust, but verify

You wouldn’t publish the drawing board of a brainstorm session. Similarly, don’t take anything ChatGPT says as truth until you verify it with your own research. 

The University of Waterloo notes that blending curiosity and critical thinking with ChatGPT can help you think through ideas and new angles. But once the brainstorming is done, it’s time to turn to real research for confirmation.

Technology

Multimodal AI promises transformative changes to health care delivery

The health-care industry is slowly adopting first-generation AI technologies, but with multimodal AI on the horizon, it’s time to get ready for what’s next

Multimodal AI is the next big thing in health care. Except the previous next big thing — regular ‘garden variety’ unimodal AI — hasn’t even fully arrived yet.

“In health care, AI is a promising technology that has been deployed in very small pockets here and there, but it has not broadly impacted health care in a meaningful way,” said Dr. Amol Verma, a physician, scientist, Assistant Professor in General Internal Medicine at St. Michael’s Hospital and the University of Toronto and the 2023 Temerty Professor of AI Research and Education in Medicine at the University of Toronto. 

“The vast majority of clinicians don’t really use AI in their clinical practice.” 

That fact doesn’t mean we’re not at the outset of a multimodal AI revolution in health care, with transformative changes that include improving diagnostics, enabling remote care, and driving efficiencies in an industry not exactly known for them.

But before we get too far ahead of ourselves here — what exactly is multimodal AI?

Multimodal AI systems are capable of processing and synthesizing multiple forms of data in order to provide outputs that include decisions, recommendations, and predictions. 

Just like people.

“We are multimodal,” said Elham Dolatabadi, a data scientist and Assistant Professor at York University focused on machine learning in health care. “We can smell, see, and hear. These are all different modalities. Our brains integrate them to come to one decision.”

So, where a unimodal AI system in health care might only be able to assess MRI scans, a multimodal AI system could process the information from that MRI scan, along with clinical notes, lab tests, genomic data and real-time patient health tracking information, amongst other inputs. In this way, a multimodal AI system would replicate the way actual doctors process patient health information — by considering and analyzing multiple sources of data.
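To make the idea of combining several data sources concrete, here is a toy sketch in Python of one simple approach, sometimes called late fusion: each modality produces its own risk score, and the scores are merged with a weighted average. The modality names, the weights, and the `fuse_scores` function are all hypothetical and are not taken from any system described in this article. Real clinical models are far more sophisticated.

```python
from typing import Dict, Optional

# Hypothetical per-modality weights; a real system would learn these from data.
MODALITY_WEIGHTS: Dict[str, float] = {
    "mri": 0.4,
    "clinical_notes": 0.2,
    "lab_tests": 0.25,
    "wearables": 0.15,
}


def fuse_scores(scores: Dict[str, Optional[float]]) -> float:
    """Combine per-modality risk scores (each in [0, 1]) into one score.

    Modalities that are missing (None or absent) are skipped, and the
    remaining weights are renormalized so they still sum to 1.
    """
    available = {
        name: score
        for name, score in scores.items()
        if score is not None and name in MODALITY_WEIGHTS
    }
    if not available:
        raise ValueError("no usable modality scores were provided")
    total_weight = sum(MODALITY_WEIGHTS[name] for name in available)
    weighted_sum = sum(MODALITY_WEIGHTS[name] * score for name, score in available.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Illustrative values only; lab results are missing for this patient.
    patient = {"mri": 0.8, "clinical_notes": 0.5, "lab_tests": None, "wearables": 0.6}
    print(round(fuse_scores(patient), 3))
```

The renormalization step is the part worth noticing: a patient with no lab results still gets a score based on whatever data is available, which loosely mirrors how a doctor works with an incomplete chart.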

What multimodal AI could mean for health care

Multimodal AI systems could bring almost unfathomable computational power to health care, especially when it comes to diagnostics and predictive capabilities. This could mean a much more personalized and precise approach to health care for individual patients. Think earlier, more accurate diagnoses and better outcomes — all at scale. 

For example, a computer could process radiology images and incorporate that data into its predictions to make them more accurate. It could also incorporate actual conversations between a patient and physician and even assess clinical deterioration from the sound of a patient’s voice.

Multimodal AI could also unlock the potential of telemedicine. Right now, health care providers are limited to assessments based on their conversations and observations over video calls. But if a patient had sensory technology in their home capturing their personal data and feeding it into a multimodal AI system, that could change health care dynamics for millions of patients. 

“It could dramatically improve access to care for people who are far away from clinical care or who, for whatever reason, have a hard time getting out of their house,” said Dr. Verma.

Wait times could also be improved. For example, patients regularly face wait times for ultrasound imaging. That’s partly because operating an ultrasound machine is what’s called ‘operator dependent.’ A clinician needs to be properly trained to use it. But AI could actually help patients capture their own images if it was embedded in a device they could use themselves.

“It could change who can deliver health care, not just who receives it,” said Dr. Verma.

Ethical considerations remain central to AI deployment in health care

For many, the widespread use of AI in health care still seems risky. What about privacy? What about the risks of entrusting critical patient care decisions to AI? 

Shaina Raza is an Applied Machine Learning Scientist for Responsible AI at the Vector Institute, an independent, not-for-profit focused on AI research. Her work focuses on the ethics of responsible AI in public health.

“With critical decisions about the life or death of a person, the doctors should make those decisions,” said Raza. “It’s not appropriate for generative AI models to do that. But if AI is used to facilitate research or help the doctors’ decisions, that’s different. We can save hundreds or thousands of hours that way.”

Raza notes that ethical AI in health care is ultimately about humans creating the frameworks for the AI before it’s ever fully deployed. That’s how best to address issues like patient privacy and systemic biases. 

“Patient privacy is very sensitive. We need to de-identify or mask patient data before feeding it into the AI models,” said Raza. “We can also clean the data for biases before we feed it into the models, with what we call prompt engineering, the instructions we provide to the models.”
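As a rough illustration of what masking patient data can look like in practice, the Python sketch below drops direct identifiers from a record and replaces the patient ID with a salted hash. The field names, the `deidentify` function, and the salt handling are assumptions made for this example. Genuine de-identification also has to deal with dates, free-text notes, and rare combinations of attributes, which this sketch does not attempt.

```python
import hashlib
from typing import Any, Dict

# Hypothetical field names; a real record schema will differ.
DIRECT_IDENTIFIERS = {"name", "address", "phone", "health_card_number"}


def deidentify(record: Dict[str, Any], salt: str) -> Dict[str, Any]:
    """Return a copy of the record with direct identifiers removed.

    The patient ID is replaced with a truncated, salted SHA-256 digest so that
    records for the same patient can still be linked without exposing the
    original ID. The input record is not modified.
    """
    cleaned = {key: value for key, value in record.items() if key not in DIRECT_IDENTIFIERS}
    if "patient_id" in cleaned:
        raw = (salt + str(cleaned["patient_id"])).encode("utf-8")
        cleaned["patient_id"] = hashlib.sha256(raw).hexdigest()[:16]
    return cleaned


if __name__ == "__main__":
    record = {
        "patient_id": "12345",
        "name": "Jane Doe",
        "phone": "555-0100",
        "lab_result": 5.4,
    }
    print(deidentify(record, salt="example-salt"))
```

Using the same salt for every record keeps the pseudonymous ID stable, so a model can still see that two lab results belong to one person without learning who that person is.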

Once these sticky issues are addressed, multimodal AI will likely have revolutionary, positive impacts on our health-care system. But since the health-care industry is typically slow to adopt new technologies, that revolution may have to wait a few years.

This might actually be a good thing, according to Dr. Verma.

“Technology is deployed much more widely in the general society than it is in medicine,” said Dr. Verma. “We’re talking about a profession that still frequently uses fax machines. I can do more sophisticated things on my mobile device than I can for applications that are medical-specific. That’s a good thing because the stakes in medicine are very high — literally life and death. We don’t want to be deploying unproven technology so rapidly in that context.”

So while multimodal AI isn’t yet ready for broad deployment, the industry is still preparing for what’s next.

“Our aim is to include as many modalities in the models as possible, including images, text, electronic medical records, wearables, signals like ECG and EEG, and genomics,” said Dolatabadi, who is currently conducting research in multimodal learning with generative AI. 

“Then the models could be used for different applications. And once the models are built, hospitals or health-care organizations can fine tune them for their own applications and patient populations.”

How health-care organizations and providers can prepare for the age of multimodal AI

With multimodal AI a matter of when and not if, it’s incumbent upon everyone in the health-care industry to be prepared for the changes ahead. 

Dr. Verma outlines four key areas of focus over the next few years:

  1. Design the right systems to safely deploy AI technologies. “We need to be designing the right institutions, relationships and incentives. That means creating, effectively, centres of excellence with the infrastructure, data and skilled personnel that can monitor AI technologies. These centres would then connect to primary providers, who you can’t expect to assess whether an AI solution is working. That’s just not feasible or practical for them.”
  2. Create a plan for collecting critical data to avoid exacerbating system biases. “We know systems are prone to bias, so the second thing that must happen at a system level is creating a plan to collect data about patient race, language and gender, which we currently do not collect. If we don’t collect that information, we won’t know whether solutions are biased and will therefore perform poorly. We could end up exacerbating biases in our health-care system.”
  3. Build up AI expertise, skills and organizational capacity. “Organizations need to identify AI champions, either through recruitment or upskilling. They need people that have technical skills and skills related to the legality and ethics of AI. And they need people that have skills related to the change management aspects of AI implementation. There’s also the question of scale. Not every organization can do this. Big organizations should scale up and smaller organizations that can’t should partner with big organizations that can.”
  4. Get educated on AI technologies. “I think at the individual practitioner level, people basically just need to become more aware and educated through professional development. Some basic understanding of these technologies would be good.”

Despite all the complexity with AI, there are options for more people to live longer, healthier lives if multimodal AI is deployed across the health-care system. The fact that health-care organizations are still working to embed unimodal AI, much less multimodal AI, doesn’t change the trajectory. In the coming years, almost all of us will find our health care experiences enabled by AI in ways both obvious and hidden. It’s the inevitable next step.

Technology

New IT infrastructure for Gilbane includes LLMs, payment automation, and unified tech departments

A look at 153-year-old real estate titan Gilbane’s digital transformation

Canada’s construction industry has been slower to get on the digital transformation train. Yet virtual design and automation present serious opportunities for the industry to evolve with digital transformation.

Research shows that more than 80% of construction companies have room to improve their data capabilities, and the most commonly cited benefits of digital transformation were increased productivity, customer experience and staff safety.

One company that made it happen is Gilbane — a U.S.-based real estate development company worth more than $6.5 billion, with family roots that persevere to this day. In fact, they even have employees in the family’s sixth generation. Despite the “legacy” mindset in the company’s fabric (it’s 153 years old) and business industry, Gilbane boosted risk management and productivity under a brand new modular IT infrastructure. 

Here’s how they did it (and how you can, too):

Unify all digital initiatives under one leader

The company created a position that was responsible for overseeing the progress on multiple tech initiatives, including AI, analytics, IT, and cybersecurity. This helped them avoid the risk of wasted budgets through silos and miscommunication. 

“We believe this structure is the most effective to bring together our data and technology resources to drive transformation and get a real return on invested capital.” - Karen Higgins-Carter, Chief Digital Information Officer (CDIO)

Invest in analytics and AI for risk management

Safety is a primary concern in the construction industry. Despite improvements in safety measures, equipment, and training, the construction industry still experiences high rates of death and injury. In fact, in 2022 the National Safety Council ranked the construction industry in the top four most dangerous, noting that it experienced the most workplace deaths.

Gilbane’s team is investing in analytics and AI, including large language model experiments, to help them identify trends that indicate potentially unsafe conditions on a worksite, Higgins-Carter told CIO. “In construction, our teams are managing the construction of hundreds of projects happening at any one time,” she said. “Our analytics capabilities identify potentially unsafe conditions so we can manage projects more safely and mitigate risks.”

“To help us manage risk, I need to understand the leading indicators of risk on a job, like attrition or high volumes of change order.” - Karen Higgins-Carter, Chief Digital Information Officer (CDIO)

Automate payment processing with operation-specific triggers

Higgins-Carter told CIO the company recently piloted an automated payment program for Gilbane to pay subcontractors more efficiently. Powered by videos and photos of work completed as triggers, payments are automatically dispensed to the necessary parties. 
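The article does not describe how Gilbane's pilot works internally, but the general idea of evidence-triggered payments can be sketched in a few lines of Python. Everything here is hypothetical: the `Milestone` fields, the `verified` flag (standing in for whatever review step confirms the photos or videos), and the `release_payments` function are illustrative assumptions, not the company's system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Milestone:
    subcontractor: str
    amount: float
    evidence: List[str] = field(default_factory=list)  # e.g. photo or video file names
    verified: bool = False  # set by a hypothetical review step, human or automated
    paid: bool = False


def release_payments(milestones: List[Milestone]) -> List[str]:
    """Mark eligible milestones as paid and return a log of released payments.

    A milestone is eligible only if it has at least one piece of evidence,
    that evidence has been verified, and it has not been paid already.
    """
    log = []
    for milestone in milestones:
        if milestone.paid:
            continue
        if milestone.evidence and milestone.verified:
            milestone.paid = True
            log.append(f"Paid {milestone.subcontractor} ${milestone.amount:,.2f}")
    return log


if __name__ == "__main__":
    work = [
        Milestone("Acme Electrical", 1000.0, ["panel_installed.jpg"], verified=True),
        Milestone("Best Plumbing", 500.0),  # no evidence uploaded yet
    ]
    for line in release_payments(work):
        print(line)
```

Tracking the `paid` flag on each milestone keeps the function safe to run repeatedly: calling it a second time releases nothing new.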

Educate the entire team and inform new processes with their experience

Hold meetings and training sessions to ensure executives and employees understand the benefits and functions of any new tech or business processes.

“We can’t deliver technology if we don’t understand our employees’ experience. If I go out to a job site once a month, then my team will too.” - Karen Higgins-Carter, Chief Digital Information Officer (CDIO)

Read the full article on CIO here.
