It’s Showtime recently held a segment paying homage to Filipino cinema’s veterans of comedy. The musical segment featured the trio of Teddy Corpuz, Vhong Navarro, and Jugs Jugueta performing bits from comedians of the past, primarily Redford White, Babalu, and the Comedy King himself, Dolphy.
The performance was quite the warm tribute, with the artists performing the gestures and signature facial expressions that made Dolphy and the rest dear to our generation. Little did we know that It’s Showtime would excel far beyond its timeslot contemporaries in using generative AI by leveraging full facial swapping on its live broadcast.
The show sidestepped any mention of the technology, focusing on the production at hand. The only hint came later in the segment, which was billed as a “first in Philippine history” when all three artists, under their face-swapped personas, greeted the audience together as a trio (as Dolphy, Redford White, and Babalu).
That in itself was a big rush of nostalgia, and anyone who was around in the 80s, 90s, and early 2000s would be familiar with many of the faces of the actors and actresses who have made us laugh over the years, brought to life once again via generative AI.
It was a powerful moment, one that brought some to tears and left others in awe. For those of us in the technology industry, it perhaps stirred some mixed emotions, but it is undeniable that this is a different style of fan service.
Comedy history and nostalgia aside, in this article we’ll talk about some of the details that made this segment possible, via the use of modern deep learning techniques. We’re going to talk directly with the people that made this technology available, the challenges of the project and the computing power needed to make this possible on a live broadcast.
Generative AI and the Power of Deepfakes
Generative AI, a rapidly evolving field in artificial intelligence, has made significant strides in creating realistic digital content, notably through the development of deepfakes. Deepfakes are synthetic media where a person in an existing image or video is replaced with someone else’s likeness, often using machine learning techniques. This technology has found applications in various sectors, including entertainment, where it allows for the creation of highly realistic visual effects and the potential to bring historical figures to life in film and media as demonstrated by the It’s Showtime segment, bringing to life actors and actresses from the past for their tribute. It also offers practical uses in education and training, providing immersive, interactive experiences that can enhance learning.
However, the power of deepfakes comes with inherent risks and potential consequences. The ability to create convincing fake content can lead to challenges in distinguishing between real and synthetic media, raising concerns about misinformation and its impact on public opinion and trust. This has prompted discussions about the ethical implications of deepfake technology and the need for robust detection methods. The development of deepfake detection tools is an ongoing area of research, aiming to safeguard against the misuse of this technology while preserving its innovative potential. People we’re in touch with for this article have confirmed that It’s Showtime sought permission to use the likeness of the actors and actresses before proceeding with the segment and that members of the audience that day included families of those featured.
In closing, even with challenges, the benefits of generative AI and deepfakes are significant. They offer transformative possibilities in creative industries, enabling filmmakers and artists to push the boundaries of their craft. This has been a key talking point in the recent SAG-AFTRA negotiations which have now concluded. In other industries like personalized marketing and customer service, deepfakes can create more engaging and tailored experiences. As the technology continues to evolve, it is crucial to balance its innovative applications with responsible use and awareness of its implications. This balance will ensure that generative AI continues to be a powerful tool for positive change and creativity, while minimizing potential risks to society.
The Technology Behind It’s Showtime’s Tribute
I wanted to talk about the segment and its premise first because it actually becomes simpler once we go into the actual process of making it possible. I also wanted to be transparent about my stance on this segment, as I personally feel we’re still at the stage where some folks outright discount the technology’s merits and applications in the name of privacy. That said, in this case the usage is simple and very straightforward.
I won’t go into the technical details of how a deepfake is made, but for reference, we all remember the early days of the internet when we would simply replace someone’s head with someone else’s. After all, that’s how Photoshop reached mainstream popularity. A deepfake operates on a similar principle but takes away the manual labor, essentially teaching the computer to recognize key points in a face and match them to a replacement face.
All the while, the replacement face is also fed to the model so it can learn the various angles of the face. This lets the model render the face at various angles even from a single reference photo.
This task is very computationally demanding, but luckily, modern hardware has reached the point where we don’t need an entire room of computers to do the job, and newer techniques now need only a single reference image.
With that out of the way, we’ll first talk about the software behind the segment. For this, we reached out to the person who built it: Kenneth Estanislao, AI specialist.
Back2Gaming: Hi Kenneth, thank you for taking the time to talk with me about this project. Let’s get right to it. Can you share details on how you achieved this production?
Kenneth: It all started with a friend of mine having created a viral video on TikTok that involved video restoration. That led to one of the talent producers of ABS-CBN contacting him, after which I was referred to the show’s team.
It’s Showtime’s showrunners called me and asked me to go to the studio to demo the technology. I showed them roop-cam (https://github.com/hacksider/roop-cam), the fork of roop I created, which uses the open source InsightFace technology and interfaces it with OpenCV to make it real-time. After they saw it, the rest is history. They got excited about it and asked for the requirements of the technology for use in their production.
I clarified that the technology is still in its infancy, so to make it happen in real time on a national broadcast, each of the performing artists should have their own dedicated computer.
We did some testing to ensure that everything would run just fine during the live show. I made some adjustments just for them to ensure everything would go smoothly, as this is my baby project that would be showcased on national TV.
Back2Gaming: What were the challenges you faced and things you had to consider?
Kenneth: Traffic? And some hardware limitations. We still can’t achieve near-perfect real-time audio/video sync, as there’s always post-processing in the background, so you can notice a 2- to 3-frame delay on live TV.
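To put that delay in perspective, here is a small back-of-the-envelope sketch that converts a delay in frames to milliseconds. The interview does not state the broadcast frame rate, so the 30 fps figure below is purely an assumption for illustration, and `frames_to_ms` is a hypothetical helper name.

```python
def frames_to_ms(frames: int, fps: float) -> float:
    """Convert a delay measured in frames into milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames / fps * 1000.0


# ASSUMPTION: a 30 fps feed. The actual broadcast frame rate is not
# stated in the interview, so treat these numbers as illustrative only.
ASSUMED_FPS = 30.0

for frames in (2, 3):
    delay = frames_to_ms(frames, ASSUMED_FPS)
    print(f"{frames} frames at {ASSUMED_FPS:g} fps is about {delay:.1f} ms")
```

Under that assumption, the delay Kenneth describes works out to roughly 67 to 100 milliseconds. A different frame rate would shift the numbers accordingly.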
Back2Gaming: Nova Villa walked into frame in the closing segment, haha. Is that something you can improve on in future productions? Allow others to walk into a scene without getting face-swapped?
Kenneth: It’s already being developed and is already done for recording use, but currently not in live mode. We always love challenges as IT professionals, and maybe soon it’ll also be available in live mode (this tech only emerged in July 2023).
Back2Gaming: Now I know you’ve been following Back2Gaming for a while now, and you’ve told me many times to do this wayyyy wayyy before this show. But for those reading this, how hard would it be to do this even at home?
Kenneth: The requirements are knowing how to install Python and some of its modules, and knowing how to navigate GitHub. I’m still in the process of making it a simple package, but there’s always a distraction (my work and DOTA 2).
Back2Gaming: How computationally demanding would this be? What was the hardware used for this project in the actual show?
Kenneth: AI currently performs best on the GPU. So for a live show/production, I told them to get an NVIDIA GeForce RTX 4080 and at least 32GB RAM (PC Express was nice enough to provide 64GB). I also told them to have something preferably with an Intel Core i7, but at least a Ryzen 7 would be ok to ensure we don’t get hiccups. But this is me just maximizing ABS-CBN’s resources just to be on the safe side during production, hahaha!
For a home user though, you’ll want an 8GB GPU (NVIDIA with CUDA will always be optimal), 16GB RAM, and 8GB of storage drive space available for installation. This is enough for any live session. But if you’re just planning on post-processing a recording, then you don’t really need a discrete GPU (though having one will always be faster), just 8GB RAM and 8GB of drive space for the installation. Any CPU from the past 5 years will usually do, though render times will vary.
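As a quick way to compare a machine against the minimums Kenneth quotes, here is a minimal sketch in Python. The `Spec` class, the `shortfalls` helper, and the example home PC are all hypothetical and made up for illustration; only the minimum figures (8GB GPU, 16GB RAM, 8GB of drive space for live use; 8GB RAM and 8GB of drive space for recordings) come from the interview.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    vram_gb: int       # dedicated GPU memory; 0 means no discrete GPU
    ram_gb: int        # system memory
    free_disk_gb: int  # free drive space for the installation


# Minimums as quoted in the interview.
LIVE_MIN = Spec(vram_gb=8, ram_gb=16, free_disk_gb=8)
RECORDING_MIN = Spec(vram_gb=0, ram_gb=8, free_disk_gb=8)


def shortfalls(machine: Spec, minimum: Spec) -> list[str]:
    """Return the names of the fields where machine is below minimum."""
    return [
        name
        for name in ("vram_gb", "ram_gb", "free_disk_gb")
        if getattr(machine, name) < getattr(minimum, name)
    ]


# HYPOTHETICAL home PC, not a machine mentioned in the article.
home_pc = Spec(vram_gb=6, ram_gb=16, free_disk_gb=100)
print("live:", shortfalls(home_pc, LIVE_MIN))
print("recording:", shortfalls(home_pc, RECORDING_MIN))
```

An empty list means the machine clears every quoted minimum for that mode. The check says nothing about actual frame rates, which will depend on the specific GPU and CPU.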
Back2Gaming: Pretty much off-the-shelf items that, provided a budget, anyone can get. Anyway, before we close, do you have anything to say to our readers interested in this field, and also to those who are critics of this technology?
Kenneth: This tool is given by the community for us to take advantage of, along with anything else in the AI space, and they’re not meant to destroy any jobs or people. Still, be responsible with what you do with it online.
But hey, most of the AI developments we’re getting right now (LLMs, GANs, etc.) were created by some kid who started out wanting to do his written projects quickly or wanting some prank photos, but then discovered they could do something more productive with the technology, hahaha!
Back2Gaming: Thank you for your time, Kenneth. Congratulations again on this project and the team that made this possible.
Talking more about the hardware used in the It’s Showtime production, Kenneth and the production team used Powered by ASUS PCs provided by PC Express which had the following configuration:
PCX Centaur Studio i7 (modified for It’s Showtime):
- Intel Core i7-13700F
- ROG Strix B760-F Gaming
- TUF Gaming RTX 4080
- 64GB Kingston Fury Renegade DDR5-6000
- 1TB Kingston NV2
- TUF Gaming LC II 360 ARGB
- TUF Gaming 750W Gold
- TUF Gaming GT502
This high-performance gaming build powered the tribute segment from It’s Showtime. It is built around an ASUS TUF GAMING GeForce RTX 4080 graphics card and is a complete Powered by ASUS TUF GAMING build. PC Express collaborated with It’s Showtime for this segment, showcasing the power of gaming PCs, not for gaming in this case, but as a complete production solution for a segment that has been viewed over 3 million times across all channels.
Take note that the show unit is a slightly modified version of an option offered by PC Express. You can, of course, tailor the build in-store, but the specific package is called the PCX Centaur Studio and is offered in configurations of up to an Intel Core i9 and an RTX 4080.
You can get this build from any PC Express outlet or their online store, along with more builds including various ASUS and ROG builds for all budgets. Shop individual components, office computing needs, laptops and more at PC Express. Whether you’re a gamer, business owner,
Visit your favorite PC Express outlet, visit their website or join the PCX Viber community for more information.
How to use DeepLiveCam
Once you have the hardware ready, all the tools have been made available by Kenneth on GitHub, with the links at the end of this article. You will need a webcam or a capture device for cameras to get your feed into roop-cam or DeepLiveCam, and then all you need to do is output it, either directly live or to a mixer.
The basic premise is face-scanning technology that allows the software to recognize faces. A subject face (the target) is mapped and then matched to a donor face (the face to be swapped in). This used to be a process where you needed to train with many facial angles, but Kenneth’s implementation requires only a single image to reconstruct a donor face onto a target face.
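To give a feel for what matching key points between two faces means, here is a deliberately simplified sketch. It is not the code used by roop-cam, DeepLiveCam, or InsightFace; it only illustrates the general idea of fitting a scale, rotation, and shift that moves two donor landmarks (say, the eyes) onto the matching target landmarks. The function names and the coordinates are made up for illustration.

```python
import cmath
import math

Point = tuple[float, float]


def fit_similarity(src_a: Point, src_b: Point,
                   dst_a: Point, dst_b: Point) -> tuple[complex, complex]:
    """Fit z -> a*z + b (scale, rotation, shift) so that src_a lands on
    dst_a and src_b lands on dst_b. Points are treated as complex numbers."""
    s1, s2 = complex(*src_a), complex(*src_b)
    d1, d2 = complex(*dst_a), complex(*dst_b)
    if s1 == s2:
        raise ValueError("source points must be distinct")
    a = (d2 - d1) / (s2 - s1)  # encodes scale (abs) and rotation (phase)
    b = d1 - a * s1            # translation
    return a, b


def apply_similarity(a: complex, b: complex, p: Point) -> Point:
    """Move a single point with the fitted transform."""
    z = a * complex(*p) + b
    return (z.real, z.imag)


# HYPOTHETICAL pixel coordinates for the two eyes in each image.
donor_eyes = ((30.0, 40.0), (70.0, 40.0))
target_eyes = ((100.0, 100.0), (100.0, 180.0))

a, b = fit_similarity(*donor_eyes, *target_eyes)
print("scale:", abs(a))
print("rotation (degrees):", math.degrees(cmath.phase(a)))
print("donor nose (50, 60) maps to:", apply_similarity(a, b, (50.0, 60.0)))
```

Real face-swapping pipelines detect many more landmarks and use a learned model to generate the swapped face, so treat this only as an intuition for the alignment step.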
Kenneth’s DeepLiveCam tracks the target face, actively reconstructing it regardless of angle, and also fully applies real-world lighting to the reconstructed face.
The software has built-in safeguards that will not allow users to run it on objectionable content. As the show’s production did, users are urged to secure permission for any likenesses when using the software.
And while the show skipped the vocal part completely, Kenneth did lead me to resources for vocals that can replicate voices, speaking mannerisms, preferred pronunciation, and more. It’s not Mission Impossible-level real-time, but the quality is definitely there.
The weight of that segment resonates best with the people who experienced that era, but if there’s something the younger generation can take from it, it is that those days are gone, yet with the power of technology coupled with talented performers working with a talented production team, proper use of face-swapping technology and generative AI can deliver a tear-jerking moment.
All of it, running off gaming PCs.
We’ve come a long, long way, my fellow gamers, and hopefully you’ve picked something up from our discussion with the man behind the technology. You can check out more of Kenneth’s work at his GitHub page below. Also do check out the Powered by ASUS build at PC Express.
All the software used is open source and is available on GitHub at the links below:
- DeepLiveCam: https://github.com/hacksider/Deep-Live-Cam
- roopcam: https://github.com/hacksider/roop-cam
- InsightFace: https://github.com/deepinsight/insightface
PC Express Centaur Studio i7 PC and other Powered by ASUS builds:
ABS-CBN will have a special covering the segment on Tao Po by Doris Bigornia this Sunday at 2:15PM on A2Z and 6:15PM on the Kapamilya Channel and Kapamilya Online Live.