Description

Book Synopsis

.- Speech.

.- Lightweight Target-Speaker-Based Overlap Transcription for Practical Streaming ASR.

.- An Empirical Analysis of Discrete Unit Representations in Speech Language Modeling Pre-training.

.- Optimizing ASR Models with Semantic Information.

.- Efficient Enhancement of Norwegian ASR Model.

.- Towards Stable and Personalised Profiles for Lexical Alignment in Spoken Human-Agent Dialogue.

.- Audio–Vision Contrastive Learning for Phonological Class Recognition.

.- TOSD-Net: A CNN-Transformer Architecture for Robust Frame-Level Overlapping Speech Detection in Diverse Acoustic Conditions.

.- An Exploration of ECAPA-TDNN and x-vector Speaker Representations in Zero-shot Multi-speaker TTS.

.- Emotion-Aware Speech-Driven Facial Avatar Animation via Joint Blendshape Prediction and Emotion Recognition.

.- Beyond Static Emotions: Leveraging Multitask Learning to Model Dynamics of Dimensional Affect in Speech.

.- Implicit Speaker Group Encoding in Self-supervised Speech Recognition Models.

.- Combining Temporal Visual Dynamics and Audio Representations for Robust Speaker Identification.

.- Sentences vs Phrases in Neural Speech Synthesis: the Phrases Strike Back.

.- Evaluating Phoneme-Level Pretraining in Czech Text-to-Speech Synthesis.

.- Unifying Global and Near-Context Biasing in a Single Trie Pass.

.- Synthesising Cross-Speaker Data for Low-Resource Pathological Speech Recognition with PEFT.

.- Multilingual Stutter Event Detection for English, German, and Mandarin Speech.

.- How Far Can Synthetic Speech Go? Enhancing ASR in Low-Resource Scenarios via Voice Cloning.

.- Enhancing Detection of Parkinson-induced Dysarthria with Cross-lingual Transfer Learning.

.- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks.

.- Detection of Cognitive Disorders Using ASR-Based Nonsense Words Repetition.

.- Mind the Gap: Entity-Preserved Context-Aware ASR for Structured Transcriptions.

.- Boosting CTC-Based ASR Using LLM-Based Intermediate Loss Regularization.

.- Robust Disfluency Labeling in Spontaneous Speech: Insights from Diverse Hungarian Corpora Including Mentally Ill Speakers.

.- ParCzech4Speech: A New Speech Corpus Derived from Czech Parliamentary Data.

.- Towards an Accurate Domain-Specific ASR: Transcription for Pathology.

.- Automated Speaking Assessment for L2 Learners of Czech.

.- Inclusive ASR for Critical Public Services: Debiasing with Actor-Simulated Speech.

.- RECA-PD: A Robust Explainable Cross-Attention Method for Speech-based Parkinson's Disease Classification.

.- Systematic FAIRness Assessment of Open Voice Biomarker Datasets for Mental Health and Neurodegenerative Diseases.

.- When Silence Speaks: Understanding Open-Ended Responses via LLMs in Therapeutic Voice Interaction.

.- Multilingual Domain Adaptation for Speech Recognition Using LLMs.

.- Using Cross-attention For Conversational ASR Over The Telephone.

Text Speech and Dialogue


    £104.49

    Includes FREE delivery

    RRP £109.99 – you save £5.50 (5%)


    A Paperback by Kamil Ekštein




      Publisher: Springer
      Publication Date: 22/09/2025
      ISBN13: 9783032025470, 978-3032025470

