Skyler Liu
SUNO AI MUSIC VIDEO
2025
In this project, my design challenge is to create a short promotional animated video that showcases Suno's creative songwriting capabilities while visually enhancing the audio with AI-generated animation.
Software
AI TOOL
Adobe Premiere
Photoshop
Suno AI
Midjourney
Keling
Vizcom
ChatGPT


01
Research: Suno AI
Suno, Inc., is a company that provides AI-generated music services. Users can create their own music using prompts and music references.
Prompt format: [Genre] song, [vocal style], [mood/emotion], [instruments/sounds], [BPM/tempo], [length], [structure hint], similar to [artist/style reference].
If lyrics are added, their length affects the duration of the song.
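As a minimal sketch, the prompt format above can be filled in programmatically. The helper name `build_suno_prompt` and its parameter names are hypothetical, chosen only to mirror the bracketed slots; they are not part of any Suno API, and the result is simply text to paste into Suno. The example values are taken from the prompt used later in this project.

```python
def build_suno_prompt(genre, vocal_style="", mood="", instruments="",
                      tempo="", length="", structure="", reference=""):
    """Join the slots of the prompt format into one comma-separated line.

    Hypothetical helper: parameter names mirror the bracketed slots above.
    """
    parts = [f"{genre} song", vocal_style, mood, instruments,
             tempo, length, structure]
    # Only add the "similar to" clause when a reference is given.
    if reference.strip():
        parts.append(f"similar to {reference.strip()}")
    # Skip empty slots so an unused field does not leave a stray comma.
    return ", ".join(p.strip() for p in parts if p.strip())


prompt = build_suno_prompt(
    genre="Medieval folk",
    vocal_style="ethereal female vocal",
    mood="mysterious and introspective",
    instruments="harp and lute",
    tempo="slow tempo ~70 BPM",
    length="no longer than one minute",
    structure="verse-chorus-outro structure",
    reference="Scarborough Fair",
)
print(prompt)
```

Keeping each slot as a separate argument makes it easy to change one element, such as the tempo, and regenerate without retyping the rest.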
02
Concept
AI is a powerful tool, like a mirror that reflects the user’s thoughts while also expanding upon them. Yet AI can sometimes push the input to extremes. People should never follow AI blindly, but instead hold on to their own independent thinking.

03
Music Generation
Since the concept follows a fable-like style, my musical approach is to compose in the form of a medieval folk ballad or Celtic chant, drawing on nearly a cappella modal structures (Dorian or Aeolian).
Lyrics
[Verse 1]
Input a whisper, echoes turn to song
Your voice reflected, a thousand outcomes strong
[Chorus]
Answers are echoes, not the only truth you see
Ideas may extend, but the choice belongs to me
[Bridge]
Follow only shadows, and yourself you may lose
Hold on to your own voice, the tool helps what you choose
[Outro]
The mirror may reply, but the future starts with I
Reference
Scarborough Fair
Prompts
Medieval folk ballad, ethereal female vocal with choral backing, mysterious and introspective mood, harp and lute with subtle flute and drone textures, slow tempo ~70 BPM, verse–chorus–outro structure, similar to Simon & Garfunkel’s “Scarborough Fair” and Celtic traditional chants, no longer than one minute
04
Storyboard Generation
At this stage, I primarily use MidJourney, as its stylization capabilities are highly effective. With an extensive sample library, it provides strong control over stylistic consistency, ensuring a cohesive visual outcome.
Prompt format: [Subject] A girl with long straight white hair, center-parted, black eyes, and a flowing white nightgown, [Action/Description], [Environment] a luxurious white European-style room, with a surreal and dreamlike atmosphere, [Style] Artistic and abstract, not overly realistic, 2.5D Final Fantasy next-gen character style, [Lighting] Soft side lighting, HDR photorealistic rendering, pearlescent reflections, dark ambient glow, glowing mist, digital noise and glitch textures, golden specks, spatial layering, cinematic composition, [Camera], [Reference], [Index] --ar 16:9
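One way to apply this format across a whole storyboard is to keep the subject, environment, style, and lighting slots fixed and vary only the action and camera for each frame. The sketch below assumes that approach; `FIXED_SLOTS` and `build_frame_prompt` are hypothetical names, and the output is plain text to paste into MidJourney, with `--ar 16:9` appended as the aspect-ratio parameter. The example action is borrowed from the mirror-punch prompt used in this project.

```python
# Fixed slots copied from the prompt format; only action and camera vary per frame.
FIXED_SLOTS = {
    "subject": ("A girl with long straight white hair, center-parted, "
                "black eyes, and a flowing white nightgown"),
    "environment": ("a luxurious white European-style room, "
                    "with a surreal and dreamlike atmosphere"),
    "style": ("Artistic and abstract, not overly realistic, "
              "2.5D Final Fantasy next-gen character style"),
    "lighting": ("Soft side lighting, HDR photorealistic rendering, "
                 "pearlescent reflections, dark ambient glow, glowing mist, "
                 "digital noise and glitch textures, golden specks, "
                 "spatial layering, cinematic composition"),
}


def build_frame_prompt(action, camera="", aspect_ratio="16:9"):
    """Assemble one storyboard-frame prompt in the slot order of the format."""
    slots = [
        FIXED_SLOTS["subject"],
        action,
        FIXED_SLOTS["environment"],
        FIXED_SLOTS["style"],
        FIXED_SLOTS["lighting"],
        camera,
    ]
    text = ", ".join(s.strip() for s in slots if s.strip())
    # The aspect-ratio parameter goes at the very end of the prompt.
    return f"{text} --ar {aspect_ratio}"


frame = build_frame_prompt(
    action=("raising her fist and swinging a powerful punch toward "
            "a vintage ornate full-length mirror"),
    camera="side view",
)
print(frame)
```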









During the image generation process, I found that MidJourney’s output logic feels almost like a gacha system—its randomness is far too high, making it impossible to achieve my ideal composition through prompts alone. On the recommendation of an art director, I tried to first generate the desired composition and poses in Vizcom, and then transfer them into MidJourney to rerun with a unified style.
The new workflow is as follows: I begin by posing a mannequin in AnyPoses.com to establish the desired action, then use Vizcom to render a high-quality pose reference, and finally import this reference into MidJourney to guide and control the generation of images.

[Outputs before optimizing the generation workflow]
Prompt: Side view of a girl with long straight white hair, center-parted, black eyes, wearing a flowing white long nightgown. She stands on the left side of the frame, raising her fist and swinging a powerful punch toward a vintage ornate full-length mirror on the right side. Her posture is dynamic and full of tension, captured mid-motion.




anyposes.com

Vizcom

MidJourney
05
Animation Generation
In this part, I use Keling to animate the storyboard. Keling’s generation logic is based on inputting the first and last frame images (or the first frame alone), and then completing the intermediate frames according to the prompt. Its generation quality is very high, reliably preserving the elements and style of the input image, while producing character movements that appear natural, with minimal noticeable errors.









Between the first and second versions of the animation, several changes were made to improve narrative continuity and create smoother visual effects:


06
Problem Solving
This is my first project created entirely with AI. During the process, I encountered the following challenges:
1. Audio with Suno AI
Suno AI often generated audio that exceeded the desired length, even when I specified “no longer than 60s” in the prompt. Fortunately, its built-in edit tool allowed me to trim the track, and if the transitions became uneven, I could regenerate only the problematic segment. I then sped up the audio in Premiere. Together, these steps reduced the track from 1:26 to 56 seconds.
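The arithmetic behind the speed-up can be sketched as follows. Only the original length (1:26) and the final length (56 seconds) come from this project; the length of the track after trimming is not recorded, so the 70-second figure below is a hypothetical value for illustration. The function names are also my own. The result is a percentage, assuming the speed is entered as a percent value in Premiere.

```python
def to_seconds(timestamp):
    """Convert an 'M:SS' string to whole seconds."""
    minutes, seconds = timestamp.split(":")
    return int(minutes) * 60 + int(seconds)


def speed_percent(source_seconds, target_seconds):
    """Playback speed (in percent) that fits the source into the target length."""
    if source_seconds <= 0 or target_seconds <= 0:
        raise ValueError("durations must be positive")
    return 100.0 * source_seconds / target_seconds


original = to_seconds("1:26")  # 86 seconds, from the project
target = 56                    # final length in seconds, from the project

# Upper bound: the speed needed if nothing had been trimmed first.
print(round(speed_percent(original, target), 1))  # 153.6

# Hypothetical: if trimming had already brought the track down to 70 seconds.
print(round(speed_percent(70, target), 1))  # 125.0
```

The gap between the two results shows why trimming first matters: the more that is cut beforehand, the less the remaining audio has to be sped up.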

2. Image control with MidJourney
Because of MidJourney’s inherent randomness, visual consistency required attention to both prompts and references.
Prompts: Once the style was set, certain keywords had to remain consistent, and strict formatting ensured effectiveness. For example:
Subject: A girl with long straight white hair, center-parted, black eyes, in a flowing white nightgown
Environment: A luxurious white European-style room, surreal and dreamlike
Style: Artistic and abstract, not overly realistic, 2.5D Final Fantasy next-gen character style
Lighting: Soft side lighting, HDR photorealistic rendering, pearlescent reflections, glowing mist, glitch textures, golden specks
Adding a high-quality index at the end often improved clarity. I relied on ChatGPT to draft these prompts, since it consistently maintained structure and precise wording.
References: MidJourney offers three types—image prompts for composition and action, style references for lighting, and omni references for character features. The roles of these three elements are interdependent rather than strictly separated, as MidJourney’s algorithm blends the information from all three references into a unified output.
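Because certain keywords have to stay identical from prompt to prompt, a small check can catch drift before an image is generated. The following is only a sketch of that idea: `REQUIRED_KEYWORDS` and `missing_keywords` are hypothetical names, and the keyword list is drawn from the Subject, Environment, and Style lines above.

```python
# Keywords that should appear verbatim in every prompt (taken from the list above).
REQUIRED_KEYWORDS = [
    "long straight white hair",
    "center-parted",
    "black eyes",
    "flowing white nightgown",
    "luxurious white European-style room",
    "2.5D Final Fantasy next-gen character style",
]


def missing_keywords(prompt, required=REQUIRED_KEYWORDS):
    """Return the required keywords that do not appear in the prompt."""
    lowered = prompt.lower()
    return [kw for kw in required if kw.lower() not in lowered]


# Opening sentence of the early mirror-punch prompt from this project.
draft = ("Side view of a girl with long straight white hair, center-parted, "
         "black eyes, wearing a flowing white long nightgown.")

for keyword in missing_keywords(draft):
    print("missing:", keyword)
```

On this draft the check flags the nightgown wording (the draft says "flowing white long nightgown") along with the absent environment and style keywords, which is the kind of small variation that is easy to miss by eye.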




07
AI Tools
The following AI tools were used in this project:

Suno AI (text-to-audio): Its strength lies in being a specialized music-generation tool capable of producing high-quality audio. Even a few casually supplied prompts can yield results that sound as if they had been polished for months, as its minimum generation quality is remarkably high. However, creating more refined results often requires a certain level of musical knowledge.

MidJourney (text-to-image): Known for its high image quality and vast, continuously updated sample library, it can satisfy diverse image-generation needs. Its drawback is high randomness in outputs. Recently, new features—text-to-animation and image-to-animation—have been introduced, though the animation function remains limited in flexibility and demonstrates weak comprehension of camera movement prompts.

Keling (image-to-animation): Produces high-quality animation with strong prompt comprehension and flexibility. Character movements are usually natural and free of major errors (e.g., extra limbs). The animation rhythm is well handled, with built-in easing, and it integrates DeepSeek to assist users in writing prompts. Its current limitation is the fixed duration: only 5s or 10s clips are available. It is also somewhat expensive at present. Overall, however, it is already a very capable tool.

Vizcom (image-to-image): Strong in content recognition, making it well-suited for rendering hand-drawn sketches into polished images. Its stylization features, however, are limited and do not support custom styles.

ChatGPT (text-to-prompt): Outstanding in natural language generation and the most effective AI language tool I have used, making it invaluable for crafting structured and precise prompts. Greatness speaks for itself.
Through this project, I learned the workflow of creating animation with AI and further refined my skills in AI-based image generation. The tools involved in this process proved remarkably powerful, inspiring deep appreciation for the pace of technological progress. Moreover, they continue to evolve rapidly.
I believe AI will play a vital role in our future, touching nearly every aspect of our lives. Yet we must remember not only to keep pace with the times, but also that technology is merely a tool—human beings remain the foundation.