
FAQ & Troubleshooting

Common questions and solutions

Frequently Asked Questions

Why is the output different every time?

The AI is nondeterministic. Like a human actor, it gives a slightly different performance each time. Use the Temperature slider to control this variability:

  • Lower temperature: More consistent, predictable outputs
  • Higher temperature: More variation and expressiveness
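
For the curious, the short Python sketch below illustrates why this happens. It is an illustration only, not Moknah's internal model: dividing preference scores by the temperature before converting them to probabilities makes a low temperature nearly deterministic and a high temperature much more varied.

```python
# Illustrative only: a toy sampler, not Moknah's internal model.
# Dividing scores by the temperature before softmax sharpens the distribution
# (low temperature -> nearly deterministic) or flattens it (high temperature).
import numpy as np

def sample_variants(scores, temperature, n=10, seed=0):
    rng = np.random.default_rng(seed)
    probs = np.exp(np.array(scores) / temperature)
    probs /= probs.sum()
    return rng.choice(len(scores), size=n, p=probs)

scores = [2.0, 1.0, 0.5]                          # hypothetical scores for 3 renditions
print(sample_variants(scores, temperature=0.2))   # almost always rendition 0
print(sample_variants(scores, temperature=1.5))   # mixes all three renditions
```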

Why does voice selection matter so much?

Voice selection has a tremendous effect on the output. The AI inherits characteristics from the selected voice:

  • If you want an introspective voice, select a voice that sounds introspective
  • If you want an energetic voice, select one cloned from energetic samples
  • If you want a specific dialect, choose a voice trained on that dialect

Remember: Good Input = Good Output

If your source audio contains noise, reverb, or multiple speakers, the output will be unstable. Always use clean, high-quality source material for the best results.

How are credits calculated?

Credits are calculated based on the number of characters in your text and the normalization mode selected:

  • Basic Normalization: 1× the character count
  • AI-Enhanced Normalization: 2× the character count
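
As a rough worked example, the sketch below assumes one credit per character as the base rate, which the multipliers above suggest but the guide does not state exactly; the actual billing formula may differ.

```python
# Rough estimate only; assumes one credit per character as the base rate,
# which the multipliers above suggest but the guide does not state exactly.
def estimate_credits(text: str, ai_enhanced: bool) -> int:
    multiplier = 2 if ai_enhanced else 1
    return len(text) * multiplier

sample = "مرحبا بكم"                                 # 9 characters, space included
print(estimate_credits(sample, ai_enhanced=False))   # 9
print(estimate_credits(sample, ai_enhanced=True))    # 18
```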

What audio format is generated?

All generated audio is delivered in MP3 format at 128 or 192 kbps (44.1 kHz) for optimal quality and compatibility.
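
If you want to confirm those properties on a downloaded file, a quick check with the third-party mutagen library looks like this (the filename is a placeholder):

```python
# Quick format check with the third-party "mutagen" library; "output.mp3" is a
# placeholder for whatever filename you downloaded.
from mutagen.mp3 import MP3

info = MP3("output.mp3").info
print(f"bitrate:     {info.bitrate // 1000} kbps")   # expect 128 or 192
print(f"sample rate: {info.sample_rate} Hz")         # expect 44100
print(f"duration:    {info.length:.1f} s")
```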

Is there a character limit?

  • Emotions & Dialects Enabled: 3,000 characters (~3 minutes of audio)
  • Standard Mode: 10,000 characters (~10 minutes of audio)
  • Text to Audio (Quick): 1,500 characters (~1.5 minutes of audio)

For longer content, use the Text to Audio Studio.
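
A simple pre-submission length check against the limits above might look like the sketch below; the mode labels in the dictionary are illustrative, not official identifiers.

```python
# Pre-submission length check using the limits listed above.
# The mode keys are illustrative labels, not official identifiers.
LIMITS = {
    "emotions_dialects": 3_000,
    "standard": 10_000,
    "text_to_audio_quick": 1_500,
}

def fits(text: str, mode: str) -> bool:
    return len(text) <= LIMITS[mode]

with open("script.txt", encoding="utf-8") as f:
    script = f.read()

if not fits(script, "standard"):
    print("Too long for Standard Mode: split the text or use the Text to Audio Studio.")
```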

Can I use the generated audio commercially?

Yes, all audio generated with your account can be used for commercial purposes according to your subscription terms.


Troubleshooting

Audio sounds robotic

  • Try increasing the Temperature setting for more natural variation
  • Increase the Expressiveness slider slightly
  • Enable Emotions & Dialects Mode for expressive content
  • Try a different voice that matches your content style

Output is inconsistent or unstable

  • Lower the Temperature setting
  • Lower the Expressiveness slider
  • Use longer prompts (250+ characters recommended for Emotions mode)
  • Check that your source audio (for voice cloning) is clean and noise-free

Generation is slow

  • Longer texts take more time to process
  • AI-Enhanced normalization requires additional processing
  • Check your internet connection

Unexpected pronunciation

  • Use AI-Enhanced Normalization for Arabic text (adds diacritics automatically)
  • Add manual diacritics (تشكيل) to ambiguous words
  • Write numbers as words instead of digits
  • Expand abbreviations (e.g., "Dr." → "Doctor")
  • Try a different voice; some voices handle certain words better
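
A minimal preprocessing sketch that applies two of the tips above is shown below; the abbreviation map is a hypothetical example, so extend it for your own content.

```python
import re

# Minimal preprocessing sketch: expands a hypothetical abbreviation map and flags
# digits so you can write them out as words before generating audio.
ABBREVIATIONS = {
    "Dr.": "Doctor",
    "St.": "Street",
}

def expand_abbreviations(text: str) -> str:
    for abbr, full in ABBREVIATIONS.items():
        text = text.replace(abbr, full)
    return text

def find_digits(text: str) -> list[str]:
    # Flagged rather than converted: spell these out by hand (or with a
    # number-to-words library for your language).
    return re.findall(r"\d+", text)

script = expand_abbreviations("Dr. Ahmed arrives at 10 St. George.")
print(script)               # Doctor Ahmed arrives at 10 Street George.
print(find_digits(script))  # ['10']
```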

Dialect not working correctly

  • Ensure the selected voice matches the dialect in your text
  • Use Basic Normalization for dialect content (not AI-Enhanced)
  • Start your text with a strong dialect word to "prime" the model
  • Enable Emotions & Dialects Mode

Voice clone doesn't sound like the original

  • Increase the Similarity slider
  • Ensure your training audio was high quality (no noise, echo, or reverb)
  • Check that training audio had consistent tone and energy throughout
  • If the source audio contained noise, lower the Similarity slider instead to reduce reproduced artifacts
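
For a rough pre-flight check on your training audio before cloning, a sketch using the third-party soundfile and numpy packages might look like this; the interpretation in the comments is general guidance, not an official requirement.

```python
# Rough pre-flight check for voice-cloning source audio; the interpretation in the
# comments is general guidance, not an official Moknah requirement.
import numpy as np
import soundfile as sf

def check_clone_source(path: str) -> None:
    data, sr = sf.read(path)
    if data.ndim > 1:                 # mix stereo down to mono for analysis
        data = data.mean(axis=1)
    peak = float(np.abs(data).max())
    rms = float(np.sqrt(np.mean(data ** 2)))
    print(f"sample rate: {sr} Hz")
    print(f"peak level:  {peak:.3f}  (values at or near 1.0 suggest clipping)")
    print(f"RMS level:   {rms:.3f}  (very low values suggest a quiet recording)")

check_clone_source("training_sample.wav")   # placeholder filename
```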

Contact Support

Reach out to our support team at support@moknah.io and we'll help you resolve any issues.