Each word is highlighted as it is spoken.
The Text-to-Speech API on Windows is fully working, and I have submitted the patch for review.
I have not added support for concurrent speech yet; that will be taken care of at a later stage.
I will soon upload a video so that you can see the API in action without trying out the patch.
As for further work, I have already started on Mac, and the good news is that I have the speak() and getVoices() methods working on it.
The next steps would be:
> to get the patch for Windows reviewed and landed as soon as possible.
> to add more functions to the Mac API.
P.S. A very happy Independence day to all fellow Indians 🙂
After having a word with the mentor, I have already submitted a raw patch, so that it can be tested by others and I can get some early feedback.
You can find the patch here.
Regarding the to-dos:
-> Changing the volume, pitch and rate is working. For the volume part, I had to modify nsISpeechService.idl so that we can also pass a volume parameter to the speak() function.
-> The boundary events give a correct timestamp now. I have used a built-in Windows function for this.
-> I have tested the Pause and the Resume button. They are working perfectly.
Aim for the next week:
-> To test the cancel functionality
-> To make SAPI concurrent, so that we can have two browser windows using it independently.
Both of these will require a lot of testing.
The plan is to complete the work for Windows in the next two weeks, so that I can jump to the next platform.
Hopefully, by the next post, I will have completed all the major functionality.
Stay tuned !
And I have received the mid-term payment 🙂
-> The speak function now works on every call.
-> I have added support for the following events: start/end of speech, and word and sentence boundaries.
-> The stop/pause/resume/interrupt functions for the speech.
-> Selecting a specific voice for the speech
All of these functions have been tested here: http://itsyash.github.io/webspeechdemos/
The to-dos for the next week would be:
Adding support for changing properties of the speech such as pitch, rate and volume.
And then, after testing each piece of functionality thoroughly, submitting the patch for review.
This week is also the mid-term week. I am running a bit behind schedule, but I have almost completed the Speech Adapter for Windows, which was the goal for the mid-term evaluation. The delay is due to the time I invested in learning the Windows API.
And.., Mount Fuji awaits this weekend. Will update more in the next post.
P.S. Tokyo has awesome weather.
Stay tuned for more 🙂
Coming to the project: after spending hours on it, I have finally managed to make significant progress. It has been quite challenging to dive into the completely new Windows APIs and the Mozilla codebase at the same time, and then to integrate the two. But the support of this awesome community is what drives me to get through every hurdle I face.
I have a very basic version of the SAPI Synthesis Service running on my system 😀 .
The current status of the service is that it supports the speak() and getVoices() (details about each voice) methods, plus some other minor methods. Right now the speak() method works only the first time it is called, as I haven’t implemented the nsISpeechTask and nsISpeechTaskCallBack interfaces yet.
That would be the first to do in the coming week.
I have also committed all the speech API functionality to my SAPI Git repository here, for reference.
The next to-dos would be:
-> a completely working speak() function
-> handling the pitch, rate, and other speech properties
-> adding the TTS events.
Also, I will soon be writing about my experience of writing the code so far and the difficulties I have been facing, so that it serves as a good reference for others.
Keep an eye out for the next post 🙂
After going through the Pico service, I had two major tasks for the previous week:
> Testing Pico on my Linux machine
Regarding the first part, I have tested Pico on my Linux machine. Here are the steps if anyone wants to try:
Install Pico on your Linux machine. It’s a simple one-liner. After that
That’s it for this week, and btw ! I got myself a new Mac 😀
Cheers ! 🙂
After a discussion with my mentor, Eitan Isaacson, my first step in the project was to study the implementation of the Pico service, to get inspiration for the future work.
After spending some time on the service, I have understood the basic workflow of the process. In this post, I will explain, or rather document, that workflow so that it helps in the future.
nsPicoService :> our main service, subclasses nsISpeechService.
Helper Classes :
The functions of all the classes are described in the following workflow:
The browser calls the speak method of nsPicoService with a reference to the nsISpeechTask object, along with four other parameters: the text to utter, a unique voice identifier, the speech rate, and the pitch.
Then, the PicoCallbackRunnable is executed on a new worker thread. In this process, all the text is fed to the engine in buffers of a specified size, and the output data from the engine is received in chunks.
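To make the buffer-and-chunk idea concrete, here is a rough Python sketch of that loop. It is not the Pico code: FakeEngine, its put_text/get_data/flush methods and the buffer sizes are all made up for illustration, and the "audio" is just the text bytes passed through.

```python
import threading


def text_buffers(text, buffer_size):
    """Split the text into consecutive buffers of at most buffer_size characters."""
    for start in range(0, len(text), buffer_size):
        yield text[start:start + buffer_size]


class FakeEngine:
    """Hypothetical stand-in for a TTS engine (not the real Pico API)."""

    def __init__(self, chunk_size):
        self._chunk_size = chunk_size
        self._pending = b""

    def put_text(self, buffer):
        # A real engine would synthesize audio; here the "audio" is the raw bytes.
        self._pending += buffer.encode("utf-8")

    def get_data(self):
        """Yield every full output chunk that is currently available."""
        while len(self._pending) >= self._chunk_size:
            chunk = self._pending[:self._chunk_size]
            self._pending = self._pending[self._chunk_size:]
            yield chunk

    def flush(self):
        """Yield whatever is left once all the text has been fed."""
        if self._pending:
            chunk, self._pending = self._pending, b""
            yield chunk


def synthesize(text, engine, buffer_size, on_chunk):
    """Feed the text buffer by buffer and hand each output chunk to on_chunk."""
    for buffer in text_buffers(text, buffer_size):
        engine.put_text(buffer)
        for chunk in engine.get_data():
            on_chunk(chunk)
    for chunk in engine.flush():
        on_chunk(chunk)


# Run the loop on a worker thread, the way the runnable does in the service.
received = []
worker = threading.Thread(
    target=synthesize, args=("hello world", FakeEngine(4), 5, received.append)
)
worker.start()
worker.join()
print(received)  # [b'hell', b'o wo', b'rld']
```

The point of the callback is that the caller starts getting output while the rest of the text is still being fed in, instead of waiting for the whole utterance.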
That’s it for this week. Next week, my aim is to test the Pico service and, with that, start on Windows.
I’ll be starting with the Windows platform. Bug #1003457 will keep track of the development on Windows.
After I am done with Windows, I plan to move on to Mac and then, if time permits, Linux.
Bug #1003452 and Bug #1003464 will keep track of the development on Mac and Linux, respectively.
Before starting on Windows, I’ll be studying the current Pico implementation.
A reverse proxy is a type of web server that retrieves resources on behalf of a client from one or more servers. These resources are then returned to the client as though they originated from the proxy itself.
So, basically, a reverse proxy does the exact opposite of what a forward proxy does. While a forward proxy acts on behalf of clients (or requesting hosts), a reverse proxy acts on behalf of servers. A reverse proxy accepts requests from external clients on behalf of the servers stationed behind it.
Say you have a large web site that millions of people want to see, but a single web server cannot handle all the traffic. What you can do is set up many servers and put a reverse proxy on the internet that will send users to the server closest to them when they try to visit your site.
This is part of how the Content Distribution Network (CDN) concept works. It also helps in load balancing.
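As a toy illustration of that routing decision, here is a small Python sketch. The region names, the server addresses and the BackendPicker class are all invented for the example; a real proxy would work out the client's location itself and would also forward the request.

```python
from itertools import cycle

# Hypothetical backend pools keyed by region; the addresses are placeholders.
BACKENDS = {
    "asia": ["10.0.1.1", "10.0.1.2"],
    "europe": ["10.0.2.1"],
}
DEFAULT_REGION = "asia"


class BackendPicker:
    """Pick a backend: the client's own region first, round-robin within it."""

    def __init__(self, backends, default_region):
        self._default = default_region
        # One independent round-robin iterator per region.
        self._pools = {region: cycle(servers) for region, servers in backends.items()}

    def pick(self, client_region):
        # Unknown regions fall back to the default pool.
        pool = self._pools.get(client_region, self._pools[self._default])
        return next(pool)


picker = BackendPicker(BACKENDS, DEFAULT_REGION)
print(picker.pick("asia"))  # 10.0.1.1
print(picker.pick("asia"))  # 10.0.1.2
print(picker.pick("mars"))  # no such region, so the default pool is used
```

Choosing the pool by region is the "closest server" part, and rotating inside the pool is the load-balancing part.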
Let’s jump to a practical example now.
We have a local server at IIIT that shows students their transcript/grades after they enter their login credentials.
Now, when you are not on the IIIT network, you cannot access that server. This is an issue I had been facing since my first year, because the grades mostly come out during vacations, when none of us are on campus.
Reverse proxy comes to the rescue here:
In our college, each student is provided with their own server on web.iiit.ac.in. (Mine is https://web.iiit.ac.in/~yashasvi.girdhar.)
This server is inside the IIIT network but is accessible from outside.
So, what I did was set up my web.iiit server as a reverse proxy for the isas server (the transcript server described above).
The workflow is:
My web.iiit server hosts a PHP script that takes your login credentials and sends the request to the isas server on behalf of your system.
In the first response, the isas server responds with a brief response with some cookies. These cookies are used in the next request to get a detailed response of the transcript.
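The two-request flow can be sketched roughly as follows. Note that this is Python with the standard urllib, not the PHP and cURL code from the repository, and the URLs and the "username"/"password" form field names are placeholders I made up for the example.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def build_opener_with_cookies():
    """Opener that stores cookies from each response and sends them on later requests."""
    jar = CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def fetch_transcript(opener, login_url, transcript_url, username, password):
    """Log in, then reuse the session cookies to fetch the detailed page."""
    # Step 1: post the credentials; the brief response sets the cookies.
    # The form field names here are assumptions, not the real isas ones.
    form = urllib.parse.urlencode({"username": username, "password": password})
    with opener.open(login_url, data=form.encode("ascii")) as first:
        first.read()
    # Step 2: the same opener replays those cookies on the second request.
    with opener.open(transcript_url) as second:
        return second.read()


# Usage (placeholder URLs, so this is left commented out):
# opener = build_opener_with_cookies()
# page = fetch_transcript(opener, "https://example.invalid/login",
#                         "https://example.invalid/transcript", "user", "secret")
```

The proxy script then simply returns the second response to whoever called it from outside the network.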
Link to the Code : https://github.com/itsyash/ReverseProxy
I have used the cURL library; there are many other options.
Feel free to use the code and report any issues.