FAQ & Troubleshooting
Common questions and solutions
Frequently Asked Questions
Why is the output different every time?
The AI is nondeterministic. Like a human actor, it gives a slightly different performance each time. Use the Temperature slider to control this variability:
- Lower temperature: More consistent, predictable outputs
- Higher temperature: More variation and expressiveness
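As a rough illustration of why a lower temperature gives more consistent results, the sketch below applies a temperature-scaled softmax to a few made-up scores. This is the general technique many generative models use. Whether this product works the same way internally is an assumption, and the scores and the function name are hypothetical.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate renderings of the same line.
scores = [2.0, 1.0, 0.5]

low = softmax_with_temperature(scores, 0.3)   # the top choice dominates
high = softmax_with_temperature(scores, 1.5)  # the choices sit closer together
```

At the lower temperature almost all of the probability lands on one candidate, so repeated runs tend to pick the same one. At the higher temperature the alternatives are chosen more often, which is heard as variation.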
Why does voice selection matter so much?
Voice selection has a tremendous effect on the output. The AI inherits characteristics from the selected voice:
- If you want an introspective voice, select a voice that sounds introspective
- If you want an energetic voice, select one cloned from energetic samples
- If you want a specific dialect, choose a voice trained on that dialect
If the source audio for a cloned voice contains noise, reverb, or multiple speakers, the output will be unstable. Always use clean, high-quality source material for best results.
How are credits calculated?
Credits are calculated based on the number of characters in your text and the normalization mode selected:
- Basic Normalization: x1 credits
- AI-Enhanced Normalization: x2 credits
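The calculation can be sketched as characters times the mode multiplier. The multipliers come from the list above. The base rate of one credit per character, the mode keys, and the function name are assumptions made for illustration, and how whitespace is counted is not stated here, so check your plan for the actual rate.

```python
# Multipliers from the list above; the mode keys are hypothetical names.
MULTIPLIERS = {"basic": 1, "ai_enhanced": 2}

def estimate_credits(text, mode="basic", credits_per_char=1):
    """Estimate credits as characters x base rate x multiplier (base rate is assumed)."""
    if mode not in MULTIPLIERS:
        raise ValueError(f"unknown normalization mode: {mode!r}")
    return len(text) * credits_per_char * MULTIPLIERS[mode]

sample = "Hello, world!"  # 13 characters, counting the space and punctuation
basic_cost = estimate_credits(sample, "basic")
enhanced_cost = estimate_credits(sample, "ai_enhanced")
```

Under these assumptions the same text costs twice as much with AI-Enhanced Normalization as with Basic Normalization.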
What audio format is generated?
All generated audio is delivered in MP3 format at 128 or 192 kbps (44.1 kHz) for optimal quality and compatibility.
Is there a character limit?
Yes. The limit depends on the mode you use:
| Mode | Character Limit | Approx. Duration |
|---|---|---|
| Emotions & Dialects Enabled | 3,000 characters | ~3 minutes |
| Standard Mode | 10,000 characters | ~10 minutes |
| Text to Audio (Quick) | 1,500 characters | ~1.5 minutes |
For longer content, use the Text to Audio Studio.
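If you prepare text in a script before pasting it in, a small length check can catch oversized input early. This is a minimal sketch: the limits are copied from the table above, while the mode keys and the function name are hypothetical and do not belong to any official API.

```python
# Limits copied from the table above; the mode keys are hypothetical names.
CHARACTER_LIMITS = {
    "emotions_dialects": 3000,
    "standard": 10000,
    "quick": 1500,
}

def check_length(text, mode):
    """Return (fits, overflow), where overflow is the character count over the limit."""
    limit = CHARACTER_LIMITS[mode]
    overflow = max(0, len(text) - limit)
    return overflow == 0, overflow

# Example: a 3,200-character script is 200 characters too long for Emotions & Dialects.
fits, overflow = check_length("a" * 3200, "emotions_dialects")
```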
Can I use the generated audio commercially?
Yes. Audio generated with your account can be used commercially, subject to your subscription terms.
Troubleshooting
Audio sounds robotic
- Try increasing the Temperature setting for more natural variation
- Increase the Expressiveness slider slightly
- Enable Emotions & Dialects Mode for expressive content
- Try a different voice that matches your content style
Output is inconsistent or unstable
- Lower the Temperature setting
- Lower the Expressiveness slider
- Use longer prompts (250+ characters recommended for Emotions mode)
- Check that your source audio (for voice cloning) is clean and noise-free
Generation is slow
- Longer texts take more time to process
- AI-Enhanced normalization requires additional processing
- Check your internet connection
Unexpected pronunciation
- Use AI-Enhanced Normalization for Arabic text (adds diacritics automatically)
- Add manual diacritics (تشكيل) to ambiguous words
- Write numbers as words instead of digits
- Expand abbreviations (e.g., "Dr." → "Doctor")
- Try a different voice; some voices handle certain words better
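Expanding abbreviations and spelling out numbers can be done in a small preprocessing step before you submit the text. The sketch below uses English for illustration and only handles whole numbers from 0 to 99. The abbreviation map and the function names are hypothetical, and larger numbers or decimals would need a fuller converter.

```python
import re

# Hypothetical abbreviation map; extend it for your own content.
ABBREVIATIONS = {"Dr.": "Doctor", "Mr.": "Mister"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]

def number_to_words(n):
    """Spell out an integer from 0 to 99."""
    if not 0 <= n <= 99:
        raise ValueError("only 0-99 is supported in this sketch")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"

def prepare_text(text):
    """Expand known abbreviations, then spell out standalone 1-2 digit numbers."""
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    # Longer digit runs (e.g. years) are left untouched by this pattern.
    return re.sub(r"\b\d{1,2}\b", lambda m: number_to_words(int(m.group())), text)

cleaned = prepare_text("Dr. Smith has 25 patients in 2024.")
```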
Dialect not working correctly
- Ensure the selected voice matches the dialect in your text
- Use Basic Normalization for dialect content (not AI-Enhanced)
- Start your text with a strong dialect word to "prime" the model
- Enable Emotions & Dialects Mode
Voice clone doesn't sound like the original
- Increase the Similarity slider
- Ensure your training audio was high quality (no noise, echo, or reverb)
- Check that training audio had consistent tone and energy throughout
- If the source audio had noise, lower the Similarity slider to reduce artifacts
If the problem persists, reach out to our support team at support@moknah.io and we'll help you resolve it.