Muhummad Luqman
FlutterFlow Expert | Flutter | Turn ideas into reality | MVP and SaaS Expert | Software Engineer

Custom Speech-to-Text Widget (Real-Time Transcription)

Hey everyone 👋

I recently built a custom Speech-to-Text (STT) widget in FlutterFlow and wanted to share how I approached it, in case it helps someone working on voice-based features.

What this does:

  • Converts speech to text in real-time

  • Continuously updates the input field with recognized words

  • Tap to start/stop listening

  • Handles session states (listening, stopped, errors)

  • Includes Lottie animation for better UX
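
For anyone who wants the session handling spelled out, here is a minimal pure-Dart sketch of how status strings could map to UI states. The `SessionState` enum and `nextState` helper are hypothetical names, not part of the widget; the status strings are the ones the speech_to_text plugin is understood to report, and only 'done' is actually checked in the code shared in this post.

```dart
// Hypothetical UI states for one listening session.
enum SessionState { idle, listening, error }

/// Maps a plugin status string to the next UI state.
/// Unknown statuses leave the current state unchanged.
SessionState nextState(SessionState current, String status) {
  switch (status) {
    case 'listening':
      return SessionState.listening;
    case 'notListening':
    case 'done':
      return SessionState.idle;
    default:
      return current;
  }
}

void main() {
  var state = SessionState.idle;
  for (final status in ['listening', 'notListening', 'done']) {
    state = nextState(state, status);
    print('$status -> ${state.name}');
  }
}
```

Keeping this mapping in one small function makes it easy to decide in a single place what the mic button and the animation should show.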

Key Approach:

Instead of directly using the STT plugin inside the widget, I created a singleton manager to handle:

  • Initialization (only once)

  • Listening state globally

  • Callbacks for results, status, and errors

This avoids re-initializing the plugin on every screen and keeps the listening state in one place when the widget is reused.
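
To illustrate the initialize-once idea on its own, here is a minimal pure-Dart sketch. `FakeEngine` and `SttManagerSketch` are hypothetical stand-ins, not the real plugin or the manager in this post. The sketch also swaps the boolean flag for a cached Future, so two callers that initialize at the same moment share one setup; that is a variation, not what the posted code does.

```dart
import 'dart:async';

/// Hypothetical stand-in for a speech engine; counts how often it is set up.
class FakeEngine {
  int initCalls = 0;

  Future<bool> initialize() async {
    initCalls++;
    return true; // pretend the engine is available
  }
}

class SttManagerSketch {
  SttManagerSketch._internal();
  static final SttManagerSketch _instance = SttManagerSketch._internal();

  /// Every `SttManagerSketch()` call returns the same object.
  factory SttManagerSketch() => _instance;

  final FakeEngine engine = FakeEngine();
  Future<bool>? _initFuture;

  /// Caches the in-flight Future so overlapping callers share one setup.
  Future<bool> initialize() => _initFuture ??= engine.initialize();
}

Future<void> main() async {
  final a = SttManagerSketch();
  final b = SttManagerSketch();

  // Two screens initializing at the same time still trigger a single setup.
  await Future.wait([a.initialize(), b.initialize()]);

  print(identical(a, b)); // true
  print(a.engine.initCalls); // 1
}
```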

Widget Parameters:

The widget is designed to be reusable and flexible:

  • width (double?)
    → Width of the widget

  • height (double?)
    → Height of the widget

  • onTextRecognized (Future Function(String)) ✅ Required
    → Callback that returns the recognized speech text in real-time

  • currentText (String?)
    → Existing text (used to append new speech instead of replacing it)
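
The append behaviour behind currentText can be sketched in pure Dart. `combineText` is a hypothetical helper written for illustration; in the widget the same idea lives inline in the result callback. The sketch assumes each partial result carries all words heard so far in the session, which is why the existing text is captured once at the start rather than re-read on every update.

```dart
/// Joins the text captured before the session with the words heard so far.
String combineText(String? textBeforeSession, String recognizedWords) {
  final before = textBeforeSession ?? '';
  if (recognizedWords.isEmpty) return before;
  return before.isEmpty ? recognizedWords : '$before $recognizedWords';
}

void main() {
  const existing = 'Hello team,';

  // Partial results grow during one session; each one is appended to the
  // same snapshot, so earlier partials are never duplicated.
  for (final partial in ['running', 'running late', 'running late today']) {
    print(combineText(existing, partial));
  }
}
```

If the snapshot were refreshed from the field on every result, the second update would be appended to text that already contains the first, and words would repeat.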

Tech Used:

  • speech_to_text package

  • Custom FlutterFlow widget

  • Lottie animations for voice feedback

  • Callback-based text handling

⚡ Use Cases:

  • Chat/messaging apps

  • AI assistants

  • Form inputs with voice

  • Accessibility features

// Automatic FlutterFlow imports
import '/backend/backend.dart';
import '/backend/schema/structs/index.dart';
import '/backend/schema/enums/enums.dart';
import '/flutter_flow/flutter_flow_theme.dart';
import '/flutter_flow/flutter_flow_util.dart';
import '/custom_code/widgets/index.dart'; // Imports other custom widgets
import '/custom_code/actions/index.dart'; // Imports custom actions
import '/flutter_flow/custom_functions.dart'; // Imports custom functions
import 'package:flutter/material.dart';
// Begin custom widget code
// DO NOT REMOVE OR MODIFY THE CODE ABOVE!

import 'package:speech_to_text/speech_to_text.dart' as stt;
import 'package:google_fonts/google_fonts.dart';
import 'package:lottie/lottie.dart';

// ─── Singleton STT Manager ──────────────────────────────────────────────

typedef OnSpeechResult = void Function(String text);
typedef OnStatusChange = void Function(String status);

class SpeechSttManager {
  SpeechSttManager._internal();
  static final SpeechSttManager _instance = SpeechSttManager._internal();
  factory SpeechSttManager() => _instance;

  final stt.SpeechToText _speech = stt.SpeechToText();

  bool _available = false;
  bool _initialized = false;

  OnSpeechResult? onResult;
  OnStatusChange? onStatus;
  OnStatusChange? onError;

  Future<void> initialize() async {
    if (_initialized) return;

    _available = await _speech.initialize(
      onStatus: (status) => onStatus?.call(status),
      onError: (error) => onError?.call(error.errorMsg),
    );

    _initialized = true;
  }

  bool get isAvailable => _available;
  bool get isListening => _speech.isListening;

  Future<void> startListening({required OnSpeechResult onSpeech}) async {
    onResult = onSpeech;
    if (!_available) return;
    await _speech.listen(
      onResult: (result) => onResult?.call(result.recognizedWords),
      listenFor: const Duration(minutes: 2),
      pauseFor: const Duration(seconds: 5),
      localeId: 'en_US',
    );
  }

  Future<void> stopListening() => _speech.stop();
  Future<void> cancelListening() => _speech.cancel();
}

// ─── Widget ─────────────────────────────────────────────────────────────

class SpeechRecognizer extends StatefulWidget {
  const SpeechRecognizer({
    super.key,
    this.width,
    this.height,
    required this.onTextRecognized,
    this.currentText,
  });

  final double? width;
  final double? height;
  final Future Function(String recognizedText) onTextRecognized;
  final String? currentText;

  @override
  State<SpeechRecognizer> createState() => _SpeechRecognizerState();
}

class _SpeechRecognizerState extends State<SpeechRecognizer> {
  final _sttManager = SpeechSttManager();

  bool _isListening = false;
  bool _isInitializing = true;
  String _textBeforeSession = '';

  @override
  void initState() {
    super.initState();
    _initSpeech();
  }

  Future<void> _initSpeech() async {
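    // The manager is shared, so these callbacks replace any set by another
    // widget instance.
    _sttManager.onStatus = (status) {
      debugPrint('STT status: $status');
      if (status == 'done') _resetUI();
    };

    _sttManager.onError = (error) {
      debugPrint('STT error: $error');
      _resetUI();
    };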

    await _sttManager.initialize();

    if (!mounted) return;
    setState(() => _isInitializing = false);
  }

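  Future<void> _startListening() async {
    if (!_sttManager.isAvailable || _isListening) return;

    // Snapshot the existing text once; every result of this session is
    // appended to this snapshot.
    _textBeforeSession = widget.currentText ?? '';
    setState(() => _isListening = true);

    try {
      await _sttManager.startListening(
        onSpeech: (words) {
          // Ignore late results that arrive after the widget is disposed.
          if (!mounted || words.isEmpty) return;
          final combined =
              _textBeforeSession.isEmpty ? words : '$_textBeforeSession $words';
          widget.onTextRecognized(combined);
        },
      );
    } catch (e) {
      // Do not leave the button stuck in the listening state if start fails.
      debugPrint('STT start failed: $e');
      _resetUI();
    }
  }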

  Future<void> _stopListening() async {
    await _sttManager.stopListening();
    _resetUI();
  }

  void _resetUI() {
    if (!mounted) return;
    WidgetsBinding.instance.addPostFrameCallback((_) {
      if (!mounted) return;
      setState(() => _isListening = false);
    });
  }

  @override
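  void dispose() {
    _sttManager.stopListening();
    // Clear every callback so the shared manager cannot call into a
    // disposed State (a final result may still arrive after stop).
    _sttManager.onResult = null;
    _sttManager.onStatus = null;
    _sttManager.onError = null;
    super.dispose();
  }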

  @override
  Widget build(BuildContext context) {
    final primary = FlutterFlowTheme.of(context).primary;

    if (_isInitializing) {
      return SizedBox(
        width: widget.width,
        height: widget.height,
        child: const Center(child: CircularProgressIndicator()),
      );
    }

    return SizedBox(
      width: widget.width,
      height: widget.height,
      child: Column(
        mainAxisAlignment: MainAxisAlignment.center,
        children: [
          Lottie.asset(
            'assets/jsons/audio_visualizer_animation.json',
            width: 150,
            height: 90,
            animate: _isListening,
          ),
          const SizedBox(height: 20),
          GestureDetector(
            onTap: _isListening ? _stopListening : _startListening,
            child: Container(
              width: 72,
              height: 72,
              decoration: BoxDecoration(
                color: primary,
                shape: BoxShape.circle,
              ),
              child: Icon(
                _isListening ? Icons.pause : Icons.mic,
                color: Colors.white,
                size: 32,
              ),
            ),
          ),
          const SizedBox(height: 16),
          Text(
            _isListening ? "Listening..." : "Tap to speak",
            style: GoogleFonts.inter(fontWeight: FontWeight.w500),
          ),
        ],
      ),
    );
  }
}

Would love to hear your thoughts, or how you've implemented this differently 🙌
