With speech interfaces becoming more of a thing, it's worth exploring some of the things we can do with speech interactions. Like, what if we could say something and have that transcribed and pumped out as a downloadable PDF?
Well, spoiler alert: we absolutely can do that! There are libraries and frameworks we can cobble together to make it happen, and that's what we're going to do together in this article.
These are the tools we're using
First off, these are the two big players: Next.js and Express.js.
Next.js tacks on additional functionalities to React, including key features for building static sites. It's a go-to for many developers because of what it offers right out of the box, like dynamic routing, image optimization, built-in-domain and subdomain routing, fast refreshes, file system routing, and API routes… among many, many other things.
In our case, we definitely need Next.js for its API routes on our client server. We want a route that takes a text file, converts it to PDF, writes it to our filesystem, then sends a response to the client.
Express.js allows us to get a little Node.js app going with routing, HTTP helpers, and templating. It's a server for our own API, which is what we'll need as we pass and parse data between things.
We have some other dependencies we'll be putting to use:
react-speech-recognition: A library for converting speech to text, making it available to React components.
regenerator-runtime: A library for troubleshooting the "regeneratorRuntime is not defined" error that shows up in Next.js when using react-speech-recognition.
html-pdf-node: A library for converting an HTML page or public URL into a PDF.
axios: A library for making HTTP requests in both the browser and Node.js.
cors: A library that allows cross-origin resource sharing.
Setting up
The first thing we want to do is create two project folders, one for the client and one for the server. Name them whatever you'd like. I'm naming mine audio-to-pdf-client and audio-to-pdf-server, respectively.
The fastest way to get started with Next.js on the client side is to bootstrap it with create-next-app. So, open your terminal and run the following command from your client project folder:
npx create-next-app client
Now we need our Express server. We can get it by cd-ing into the server project folder and running the npm init command. A package.json file will be created in the server project folder once it's done.
We still need to actually install Express, so let's do that now with npm install express. Now we can create a new index.js file in the server project folder and drop this code in there:
const express = require("express")
const app = express()
app.listen(4000, () => console.log("Server is running on port 4000"))
Ready to run the server?
node index.js
We're going to need a couple more folders and another file to move forward:
Create a components folder in the client project folder.
Create a SpeechToText.jsx file in the components subfolder.
Before we go any further, we have a little cleanup to do. Specifically, we need to replace the default code in the pages/index.js file with this:
import Head from "next/head";
import SpeechToText from "../components/SpeechToText";

export default function Home() {
  return (
    <div className="home">
      <Head>
        <title>Audio To PDF</title>
        <meta
          name="description"
          content="An app that converts audio to pdf in the browser"
        />
        <link rel="icon" href="/favicon.ico" />
      </Head>
      <h1>Convert your speech to pdf</h1>
      <main>
        <SpeechToText />
      </main>
    </div>
  );
}
The imported SpeechToText component will eventually be exported from components/SpeechToText.jsx.
Let's install the other dependencies
Alright, we have the initial setup for our app out of the way. Now we can install the libraries that handle the data that's passed around.
We can install our client dependencies with:
npm install react-speech-recognition regenerator-runtime axios
Our Express server dependencies are up next, so let's cd into the server project folder and install those:
npm install html-pdf-node cors
Probably a good time to pause and make sure the files in our project folders are intact. Here's what you should have in the client project folder at this point:
/audio-to-pdf-web-client
├─ /components
|  └── SpeechToText.jsx
├─ /pages
|  ├─ _app.js
|  └── index.js
└── /styles
    ├─ globals.css
    └── Home.module.css
And here's what you should have in the server project folder:
/audio-to-pdf-server
└── index.js
Building the UI
Well, our speech-to-PDF wouldn't be all that great if there's no way to interact with it, so let's make a React component for it that we can call <SpeechToText>.
You can totally use your own markup. Here's what I've got to give you an idea of the pieces we're putting together:
import React from "react";

const SpeechToText = () => {
  return (
    <>
      <section>
        <div className="button-container">
          <button type="button" style={{ "--bgColor": "blue" }}>
            Start
          </button>
          <button type="button" style={{ "--bgColor": "orange" }}>
            Stop
          </button>
        </div>
        <div
          className="words"
          contentEditable
          suppressContentEditableWarning={true}
        ></div>
        <div className="button-container">
          <button type="button" style={{ "--bgColor": "red" }}>
            Reset
          </button>
          <button type="button" style={{ "--bgColor": "green" }}>
            Convert to pdf
          </button>
        </div>
      </section>
    </>
  );
};

export default SpeechToText;
This component returns a React fragment that contains an HTML <section> element that contains three divs:
.button-container contains two buttons that will be used to start and stop speech recognition.
.words has contentEditable and suppressContentEditableWarning attributes to make this element editable and suppress any warnings from React.
Another .button-container holds two more buttons that will be used to reset and convert speech to PDF, respectively.
Styling is another thing altogether. I won't go into it here, but you're welcome to use some styles I wrote as a starting point for your own styles/globals.css file.
html,
body {
  padding: 0;
  margin: 0;
  font-family: -apple-system, BlinkMacSystemFont, Segoe UI, Roboto, Oxygen,
    Ubuntu, Cantarell, Fira Sans, Droid Sans, Helvetica Neue, sans-serif;
}

a {
  color: inherit;
  text-decoration: none;
}

* {
  box-sizing: border-box;
}

.home {
  background-color: #333;
  min-height: 100%;
  padding: 0 1rem;
  padding-bottom: 3rem;
}

h1 {
  width: 100%;
  max-width: 400px;
  margin: auto;
  padding: 2rem 0;
  text-align: center;
  text-transform: capitalize;
  color: white;
  font-size: 1rem;
}

.button-container {
  text-align: center;
  display: flex;
  justify-content: center;
  gap: 3rem;
}

button {
  color: white;
  background-color: var(--bgColor);
  font-size: 1.2rem;
  padding: 0.5rem 1.5rem;
  border: none;
  border-radius: 20px;
  cursor: pointer;
}

button:hover {
  opacity: 0.9;
}

button:active {
  transform: scale(0.99);
}

.words {
  max-width: 700px;
  margin: 50px auto;
  height: 50vh;
  border-radius: 5px;
  padding: 1rem 2rem 1rem 5rem;
  background-image: -webkit-gradient(
      linear,
      0 0,
      0 100%,
      from(#d9eaf3),
      color-stop(4%, #fff)
    )
    0 4px;
  background-size: 100% 3rem;
  background-attachment: scroll;
  position: relative;
  line-height: 3rem;
  overflow-y: auto;
}

.success,
.error {
  background-color: #fff;
  margin: 1rem auto;
  padding: 0.5rem 1rem;
  border-radius: 5px;
  width: max-content;
  text-align: center;
  display: block;
}

.success {
  color: green;
}

.error {
  color: red;
}
The CSS variables in there are being used to control the background color of the buttons.
Let's see the latest changes! Run npm run dev in the terminal and check them out.
You should see this in the browser when you visit http://localhost:3000:
Our first speech to text conversion!
The first action to take is to import the necessary dependencies into our <SpeechToText> component:
import React, { useRef, useState } from "react";
import SpeechRecognition, {
  useSpeechRecognition,
} from "react-speech-recognition";
import axios from "axios";
Then we check if speech recognition is supported by the browser and render a notice if not supported:
const speechRecognitionSupported =
  SpeechRecognition.browserSupportsSpeechRecognition();

if (!speechRecognitionSupported) {
  return <div>Your browser does not support speech recognition.</div>;
}
Next up, let's extract transcript and resetTranscript from the useSpeechRecognition() hook:
const { transcript, resetTranscript } = useSpeechRecognition();
This is what we need for the state that handles listening:
const [listening, setListening] = useState(false);
We also need a ref for the div with the contentEditable attribute, then we need to add the ref attribute to it and pass transcript as children:
const textBodyRef = useRef(null);
…and:
<div
  className="words"
  contentEditable
  ref={textBodyRef}
  suppressContentEditableWarning={true}
>
  {transcript}
</div>
The last thing we need here is a function that triggers speech recognition, tied to the onClick event listener of our button. The button sets listening to true and makes recognition run continuously. We'll disable the button while it's in that state to prevent us from firing off extra events.
const startListening = () => {
  setListening(true);
  SpeechRecognition.startListening({
    continuous: true,
  });
};
…and:
<button
  type="button"
  onClick={startListening}
  style={{ "--bgColor": "blue" }}
  disabled={listening}
>
  Start
</button>
Clicking on the button should now start up the transcription.
More functions
OK, so we have a component that can start listening. But now we need it to do a few other things as well, like stopListening, resetText and handleConversion. Let's make those functions.
const stopListening = () => {
  setListening(false);
  SpeechRecognition.stopListening();
};

const resetText = () => {
  stopListening();
  resetTranscript();
  textBodyRef.current.innerText = "";
};

const handleConversion = async () => {}
Each of the functions will be added to an onClick event listener on the appropriate buttons:
<button
  type="button"
  onClick={stopListening}
  style={{ "--bgColor": "orange" }}
  disabled={listening === false}
>
  Stop
</button>

<div className="button-container">
  <button
    type="button"
    onClick={resetText}
    style={{ "--bgColor": "red" }}
  >
    Reset
  </button>
  <button
    type="button"
    style={{ "--bgColor": "green" }}
    onClick={handleConversion}
  >
    Convert to pdf
  </button>
</div>
The handleConversion function is asynchronous because we will eventually be making an API request. The "Stop" button has a disabled attribute that kicks in when listening is false.
If we restart the server and refresh the browser, we can now start, stop, and reset our speech transcription in the browser.
Now what we need is for the app to take that recognized speech and convert it to a PDF file. For that, we need the server-side route from Express.js.
Setting up the API route
The purpose of this route is to take a text file, convert it to a PDF, write that PDF to our filesystem, then send a response to the client.
To set up, we open the server/index.js file and import html-pdf-node, the fs module that will be used to write to and read from our filesystem, and cors.
const HTMLToPDF = require("html-pdf-node");
const fs = require("fs");
const cors = require("cors");
Next, we will set up our route:
app.use(cors())
app.use(express.json())

app.post("/", (req, res) => {
  // etc.
})
We then proceed to define the options required in order to use html-pdf-node inside the route:
let options = { format: "A4" };
let file = {
  content: `<html><body><pre style='font-size: 1.2rem'>${req.body.text}</pre></body></html>`,
};
The options object accepts a value to set the paper size and style. Paper sizes follow a much different system than the sizing units we typically use on the web. For example, A4 (210 × 297 mm) is the standard sheet size in most countries, slightly narrower and taller than US Letter.
The file object accepts either the URL of a public website or HTML markup. In order to generate our HTML page, we will use the html, body, pre HTML tags and the text from the req.body.
You can apply any styling of your choice.
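One thing to keep in mind is that the transcript is interpolated straight into the HTML string, so any < or & typed into the editable div would be treated as markup. Here is a minimal sketch of one way to guard against that. The escapeHtml and buildFile helpers are hypothetical names introduced for illustration; neither is part of html-pdf-node.

```javascript
// Hypothetical helper (not part of html-pdf-node): escape the characters
// that have special meaning in HTML so user text is rendered literally.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical helper: build the `file` object that is handed to generatePdf,
// using the same markup as the article but with the text escaped first.
function buildFile(text) {
  return {
    content: `<html><body><pre style='font-size: 1.2rem'>${escapeHtml(text)}</pre></body></html>`,
  };
}
```

Inside the route, let file = buildFile(req.body.text); would then take the place of the inline template string.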
Next, we will add a try/catch to handle any errors that might pop up along the way:
try {
} catch (error) {
  console.log(error);
  res.status(500).send(error);
}
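A general JavaScript caveat applies here: a try/catch block only sees a rejected promise if that promise is awaited inside the try; a rejection coming out of a bare .then() chain slips past it. The following is a minimal sketch using async/await, not the article's own code. convertRoute is a hypothetical name, generate stands in for HTMLToPDF.generatePdf, and res is any object with Express-style status() and send() methods.

```javascript
// Hypothetical route body. `generate` stands in for HTMLToPDF.generatePdf,
// and `res` for an Express-style response object.
async function convertRoute(generate, file, options, res) {
  try {
    // Awaiting inside the try block is what lets catch see a rejection.
    const pdfBuffer = await generate(file, options);
    res.status(200).send(pdfBuffer);
  } catch (error) {
    console.log(error);
    res.status(500).send({ message: "Unable to generate PDF. Try again." });
  }
}
```

With the promise-chain style, the equivalent safeguard would be a .catch() handler at the end of the chain.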
Next, we will use generatePdf from the html-pdf-node library to generate a pdfBuffer (the raw PDF file) from our file and create a unique pdfName:
HTMLToPDF.generatePdf(file, options).then((pdfBuffer) => {
  // console.log("PDF Buffer:-", pdfBuffer);
  const pdfName = "./data/speech" + Date.now() + ".pdf";

  // Next code here
});
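Note that fs.writeFile does not create missing folders, so the folder that pdfName points into has to exist before the first request comes in. Below is a small sketch of how that could be handled. The ensureDir and makePdfName helpers are hypothetical names introduced here, not part of the article's code.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: make sure the output folder exists before writing.
// `recursive: true` makes this a no-op when the folder is already there.
function ensureDir(dir) {
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}

// Hypothetical helper: build a timestamped file path inside `dir`,
// following the same "speech" + timestamp + ".pdf" pattern.
function makePdfName(dir, now = Date.now()) {
  return path.join(dir, "speech" + now + ".pdf");
}
```

Calling ensureDir("./data") once at startup and makePdfName("./data") per request would mirror the naming above. Since Date.now() has millisecond resolution, two requests landing in the same millisecond would share a name; for a demo app that is probably acceptable.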
From there, we use the filesystem module to write, read and (yes, finally!) send a response to the client app:
fs.writeFile(pdfName, pdfBuffer, function (writeError) {
  if (writeError) {
    return res
      .status(500)
      .json({ message: "Unable to write file. Try again." });
  }

  fs.readFile(pdfName, function (readError, readData) {
    if (!readError && readData) {
      // console.log({ readData });
      res.setHeader("Content-Type", "application/pdf");
      res.setHeader("Content-Disposition", "attachment");
      res.send(readData);
      return;
    }

    return res
      .status(500)
      .json({ message: "Unable to read file. Try again." });
  });
});
Let's break that down a bit:
The writeFile filesystem method accepts a file name, data and a callback function that receives an error if there's an issue writing to the file. If you're working with a CDN that provides error endpoints, you could use those instead.
The readFile filesystem method accepts a file name and a callback function that receives a read error as well as the read data. Once we have no read error and the read data is present, we will construct and send a response to the client. Again, this can be replaced with your CDN's endpoints if you have them.
The res.setHeader("Content-Type", "application/pdf"); tells the browser that we are sending a PDF file.
The res.setHeader("Content-Disposition", "attachment"); tells the browser to make the received data downloadable.
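Since generatePdf already hands us the PDF as a buffer, reading the file back from disk is not strictly needed for the response itself. As a sketch of an alternative (this is not what the article's code does), a hypothetical sendPdf helper could set the same two headers and send the buffer directly. The default filename is an assumption made for illustration.

```javascript
// Hypothetical helper: send an in-memory PDF buffer as a download.
// `res` is an Express-style response; the default filename is assumed.
function sendPdf(res, pdfBuffer, filename = "speech.pdf") {
  res.setHeader("Content-Type", "application/pdf");
  // A filename parameter suggests what the browser should call the saved file.
  res.setHeader("Content-Disposition", `attachment; filename="${filename}"`);
  res.send(pdfBuffer);
}
```

Writing to disk would still make sense if you want to keep a copy of every generated PDF on the server.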
Since the API route is ready, we can use it in our app at http://localhost:4000. We can then proceed to the client part of our application to complete the handleConversion function.
Handling the conversion
Before we can start working on a handleConversion function, we need to create a state that handles our API requests for loading, error, success, and other messages. We're going to use React's useState hook to set that up:
const [response, setResponse] = useState({
  loading: false,
  message: "",
  error: false,
  success: false,
});
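Because each update in handleConversion sets all four fields, the transitions can be described as plain objects. The following is only an illustrative sketch of the three states the component moves through; the helper names are hypothetical and do not appear in the article's code.

```javascript
// Same shape as the initial value passed to useState.
const initialResponse = {
  loading: false,
  message: "",
  error: false,
  success: false,
};

// Hypothetical helpers returning the next state for each phase of a request.
const loadingState = () => ({ ...initialResponse, loading: true });
const successState = (message) => ({ ...initialResponse, success: true, message });
const errorState = (message) => ({ ...initialResponse, error: true, message });
```

In the component, setResponse(loadingState()) would run before the request, and setResponse(successState("…")) or setResponse(errorState("…")) after it settles.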
In the handleConversion function, we will check for when the web page has been loaded before running our code and make sure the div with the editable attribute is not empty:
if (typeof window !== "undefined") {
  const userText = textBodyRef.current.innerText;
  // console.log(textBodyRef.current.innerText);

  if (!userText) {
    alert("Please speak or write some text.");
    return;
  }
}
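The !userText check lets through text that consists only of spaces or line breaks, which a contentEditable element can easily end up containing. If that matters for your app, one option is to trim first. Here is a minimal sketch with a hypothetical isBlank helper:

```javascript
// Hypothetical helper: true when the text is missing, empty,
// or made up of whitespace only.
function isBlank(text) {
  return typeof text !== "string" || text.trim().length === 0;
}
```

The condition in handleConversion would then read if (isBlank(userText)) instead of if (!userText).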
We proceed by wrapping our eventual API request in a try/catch, handling any error that may arise, and updating the response state:
try {
} catch (error) {
  setResponse({
    ...response,
    loading: false,
    error: true,
    message:
      "An unexpected error occurred. Text not converted. Please try again",
    success: false,
  });
}
Next, we set some values for the response state, set the config for axios, and make a post request to the server:
setResponse({
  ...response,
  loading: true,
  message: "",
  error: false,
  success: false,
});
const config = {
  headers: {
    "Content-Type": "application/json",
  },
  responseType: "blob",
};

const res = await axios.post(
  "http://localhost:4000",
  {
    text: textBodyRef.current.innerText,
  },
  config
);
Once we have gotten a successful response, we set the response state with the appropriate values and instruct the browser to download the received PDF:
setResponse({
  ...response,
  loading: false,
  error: false,
  message:
    "Conversion was successful. Your download will start soon...",
  success: true,
});

// convert the received data to a file
const url = window.URL.createObjectURL(new Blob([res.data]));
// create an anchor element
const link = document.createElement("a");
// set the href of the created anchor element
link.href = url;
// add the download attribute, give the downloaded file a name
link.setAttribute("download", "yourfile.pdf");
// add the created anchor tag to the DOM
document.body.appendChild(link);
// force a click on the link to start a simulated download
link.click();
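The snippet above leaves the temporary link in the DOM and never releases the object URL. Here is a hedged sketch of the same technique with cleanup added. downloadBlob is a hypothetical helper, and its doc and urlApi parameters exist only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: trigger a download for a Blob, then clean up.
// In the browser, pass `document` and `window.URL` as the last two arguments.
function downloadBlob(blob, filename, doc, urlApi) {
  const url = urlApi.createObjectURL(blob);
  const link = doc.createElement("a");
  link.href = url;
  link.setAttribute("download", filename);
  doc.body.appendChild(link);
  link.click();
  // Remove the temporary anchor and release the object URL.
  doc.body.removeChild(link);
  urlApi.revokeObjectURL(url);
}
```

In the component this could be called as downloadBlob(new Blob([res.data]), "yourfile.pdf", document, window.URL). Some browsers may need the revokeObjectURL call deferred (for example with setTimeout) so the download has a chance to start first; treat the immediate revoke here as an assumption to verify in your target browsers.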
And we can use the following below the contentEditable div for displaying messages:
<div>
  {response.success && <i className="success">{response.message}</i>}
  {response.error && <i className="error">{response.message}</i>}
</div>
Final code
I've packaged everything up on GitHub so you can check out the full source code for both the server and the client.
Converting Speech to PDF with NextJS and ExpressJS originally published on CSS-Tricks.