Difficulty Level = 8
How Does Closed Captioning Work?
Closed captioning is the technology used to embed text or other information in an NTSC television broadcast (North America, Japan, some of South America). It is typically a transcription of the broadcast audio for the benefit of hearing-impaired viewers. No doubt you’ve all seen closed captions displayed on a TV, but how do they work? This project will explain how closed captioning technology works and then show you how you can decode and display the data using your Arduino and a Video Experimenter shield. There is a lot to learn here, so be patient! First, you can take a look at a video showing this capability, then keep reading to learn how it works.
The data that your TV displays is embedded in the broadcast itself in a special format, and in a special location of the video image. When you activate the closed captioning feature on your TV, your TV decodes the information and displays it on the screen. Whether you are displaying it or not, the data is in the broadcast encoded on line 21 of the video frame. This is defined by the standard EIA-608. Here is what the line 21 signal looks like:
This shows the voltage of a composite video signal for line 21. The horizontal sync and color burst are just like any other video line, but the section called “clock run-in” is a special sinusoidal wave that allows the TV to synchronize with the closed captioning data which is about to start. The 7-peak run-in is followed by 3 start bits with values of 001. You can see how the voltage rises for the third bit S3. The next 16 bits represent two 8-bit characters of text. That’s right, there are only two characters per video frame, but at 30 frames per second, there is enough bandwidth for closed captions. The last bit of each byte b7 is an odd parity bit. Parity bits are an error detection mechanism. That is, this bit is either on or off in order to keep the total bits in the byte at an odd number. So, if bits b0-b6 have 4 bits on, then the parity bit is on to achieve an odd number of bits (5).
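The odd parity rule described above can be sketched in a few lines of standard C++. This is only an illustration of the rule, not code from the ClosedCaptions example; the function names (countOnes, hasOddParity, stripParity, addOddParity) are made up for this sketch.

```cpp
#include <cstdint>

// Count how many bits are set in a byte.
inline int countOnes(std::uint8_t value) {
    int ones = 0;
    while (value != 0) {
        ones += value & 1;
        value = static_cast<std::uint8_t>(value >> 1);
    }
    return ones;
}

// True when all 8 bits (b0-b6 plus the parity bit b7) contain an odd number of ones.
inline bool hasOddParity(std::uint8_t raw) {
    return (countOnes(raw) % 2) == 1;
}

// Drop the parity bit b7 and keep the 7-bit character code.
inline std::uint8_t stripParity(std::uint8_t raw) {
    return static_cast<std::uint8_t>(raw & 0x7F);
}

// Build the byte a broadcaster would send for a 7-bit character code:
// the parity bit is switched on only when b0-b6 hold an even number of ones.
inline std::uint8_t addOddParity(std::uint8_t code) {
    code = static_cast<std::uint8_t>(code & 0x7F);
    if (countOnes(code) % 2 == 0) {
        code = static_cast<std::uint8_t>(code | 0x80);
    }
    return code;
}
```

A decoder would typically call hasOddParity on each received byte, discard the byte if the check fails, and otherwise pass stripParity(raw) on as the character.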
Capturing and Decoding the Data
So, how do we capture and decode this data using the Video Experimenter? We need to use the TVout library for Video Experimenter used with all Video Experimenter projects. The code for this project is in the example “ClosedCaptions” in the TVout/examples folder. You may already know from other Video Experimenter projects that we can capture a video image in the TVout frame buffer. For this project, we just want to capture the line 21 data so we can decode it. This is accomplished with the API method: tv.setDataCapture(int line, int dataCaptureStart, char *ccdata) where ‘line’ is the TVout scan line to capture, ‘dataCaptureStart’ is the number of clock cycles on that line to wait before starting to capture, and ‘ccdata’ is a buffer to store the bits in. Typically, we do something like this:
unsigned char ccdata[16]; // 128 pixels wide is 16 bytes
...
tv.setDataCapture(13, 310, ccdata);
Even though the data is on line 21, I have found it to be on line 13 or 14 as far as TVout is concerned. The value of 310 for dataCaptureStart is the value I have found to work best in order to fit both characters of data in the width of the TVout frame buffer. This will make more sense later when we visually look at the pixels captured. It may take a while to “find” the data by trying different lines and different values for dataCaptureStart to get the right alignment. Just try different values. I have also needed to adjust the small potentiometer near the reset button upward a bit. A resistance of around 710K was required instead of the standard 680K required by the LM1881 chip on the Video Experimenter. You’ll know you’ve found the data when you see a data line like the one in the images below. Sometimes you might find data that is not closed captions, but information about the program, like the title, etc. This is called XDS or Extended Data Services. This can be interesting information to decode also!
Once we tell enhanced TVout where to find the data, the buffer ccdata will always contain the pixels of the specified line of the current frame. If we display the captured pixels on the screen we can visually see how it matches up with the line 21 waveform. To produce the picture below, I copied the contents of ccdata to the first line of the TVout frame buffer so we can see the data with our eyes. The data appears as white pixels at the top of the image. It isn’t necessary to display it on the screen in order to decode it and write it to the Serial port. But it makes it easier to find the data visually and see what’s going on.
On the left side we can see the last 2 peaks of the clock run-in sine wave. Then we clearly see the start bits 001. Each bit is about 5 or 6 pixels wide. Then there are 7 zero bits (pixels off) and the parity bit (on). When this picture was taken, no dialog was being spoken, so the characters are all zero bits except for the parity bit. When text data is being broadcast, the bits flash very quickly:
Now that we have found the data in the broadcast, and can display it for inspection, we need to decode this 128-bit wide array of pixels into the two text characters. To do that, we need to note where each bit of the characters starts. Each bit is 5 or 6 pixels wide. The next step I took in my program was to define an array of bit positions that describe the starting pixel of each bit:
byte bpos[][8]={{26, 32, 38, 45, 51, 58, 64, 70}, {78, 83, 89, 96, 102, 109, 115, 121}};
These are the bit positions for the two bytes in the data line. By displaying these bit positions just below the data line, we can adjust them if needed by trial and error. Here’s an image with the bit positions displayed below the data line. Since each data bit is nice and wide, they don’t have to line up perfectly to get reliable decoding. These positions have worked well for me for a variety of video sources.
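To make the role of the bit positions concrete, here is a minimal sketch of turning the captured line into two bytes. It is not the code from the ClosedCaptions example. It assumes that pixels are packed eight per byte with the leftmost pixel in the most significant bit, that bits arrive least significant first (b0 first, parity bit b7 last), and that sampling a couple of pixels to the right of each starting position lands inside the bit. The names pixelOn and decodeByte and the default offset of 2 are choices made for this sketch.

```cpp
#include <cstdint>

// Starting pixel of each bit for the two bytes, copied from the article.
const std::uint8_t bpos[2][8] = {
    {26, 32, 38, 45, 51, 58, 64, 70},
    {78, 83, 89, 96, 102, 109, 115, 121}};

// Assumption: 8 pixels per byte, leftmost pixel stored in the most significant bit.
inline bool pixelOn(const std::uint8_t* line, int x) {
    return (line[x / 8] & (0x80 >> (x % 8))) != 0;
}

// Decode byte 0 or byte 1 from a 16-byte (128-pixel) captured line.
// Each bit is sampled `offset` pixels to the right of its starting position,
// so small alignment errors still hit the 5 or 6 pixel wide bit.
inline std::uint8_t decodeByte(const std::uint8_t* line, int which, int offset = 2) {
    std::uint8_t value = 0;
    for (int bit = 0; bit < 8; ++bit) {
        if (pixelOn(line, bpos[which][bit] + offset)) {
            value = static_cast<std::uint8_t>(value | (1 << bit));  // b0 arrives first
        }
    }
    return value;
}
```

With the idle pattern described earlier (all zero bits except the parity bit), both decodeByte(ccdata, 0) and decodeByte(ccdata, 1) would come back as 0x80, which becomes character code 0 once the parity bit is masked off.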
OK, we are almost done. Now that we have found the closed caption data line, and have established the starting points for each bit, we can easily decode the bits into characters and write them to the serial port for display on a computer. We can also just print them to the screen if we want. I have taken care of all this code for you, and it is all in the example called “ClosedCaptions” in the TVout library for Video Experimenter.
If you have problems finding the data, try different lines for the data (13 or 14), different values for dataCaptureStart, and adjust both potentiometers on the Video Experimenter. Try slowly turning the small pot near the reset button clockwise. If you are patient, you’ll find the data and decode it!
Other project ideas
- Instead of writing the data to the serial port, write it to the screen itself with tv.print(s)
- Search for keywords in the closed captions and light an LED when the word is found.
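As a starting point for the keyword idea, here is a small sketch of a matcher that is fed one decoded character at a time. It is written in standard C++ with std::string, which the AVR-based Arduino toolchain does not provide, so on the Arduino itself you would swap in a fixed char buffer. KeywordWatcher is a hypothetical name, and treating every code below 0x20 as padding is a simplification: real caption streams also carry two-byte control codes whose second byte looks like a printable character.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: remembers the most recent characters and reports
// when they spell the keyword (case-insensitive).
class KeywordWatcher {
public:
    explicit KeywordWatcher(const std::string& keyword) {
        for (char c : keyword) {
            keyword_ += lower(c);
        }
    }

    // Feed one decoded character (parity bit already removed).
    // Returns true when the keyword has just been completed.
    bool feed(char c) {
        if (keyword_.empty()) return false;
        if (static_cast<unsigned char>(c) < 0x20) return false;  // skip padding
        recent_ += lower(c);
        if (recent_.size() > keyword_.size()) {
            recent_.erase(0, 1);  // keep only the last keyword-length characters
        }
        return recent_ == keyword_;
    }

private:
    static char lower(char c) {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }

    std::string keyword_;
    std::string recent_;
};
```

In a sketch you would call feed() for each character that passes the parity check and switch the LED on whenever it returns true.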
Why do you only get 5 or 6 pixels per bit? I assume that the video data is standard 640 pixels across, so the two 8-bit bytes, 3 start bits, and a couple of cycles of the sine wave should work out to around 30 pixels per bit.
I mention this because many asynchronous serial receivers use 32x oversampling, and then they look for the rising or falling edge of the start bit to determine alignment. Once you know where the start bit lies, you should be able to predict where each bit lies. The ideal would be to look in the middle of each bit area. With ~30 pixels per bit, you could scan for the three start bits (001) and align on the rising edge of the third bit (1). From there, skip 15 pixels ahead to find the middle of the area, then jump forward 30 pixels to read each subsequent bit.
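The alignment scheme suggested here could look roughly like the sketch below. It is not how the Video Experimenter code works; it assumes the line has already been thresholded into one boolean per pixel, that the bit width in pixels is known, and that bits arrive least significant first. The rising edge of the third start bit is taken to be the first rising edge that follows a low run of at least 1.5 bit widths after the run-in has been seen, since the gaps between run-in peaks are shorter than that. The names findStartEdge and readCaptionBits are made up for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the pixel index of the rising edge of start bit S3, or -1 if not found.
inline int findStartEdge(const std::vector<bool>& pixels, int pixelsPerBit) {
    const int minLowRun = (3 * pixelsPerBit) / 2;  // longer than any gap inside the run-in
    bool sawHigh = false;                          // ignore the blank stretch before the run-in
    int lowRun = 0;
    for (std::size_t x = 0; x < pixels.size(); ++x) {
        if (!pixels[x]) {
            ++lowRun;
            continue;
        }
        if (sawHigh && lowRun >= minLowRun) {
            return static_cast<int>(x);
        }
        sawHigh = true;
        lowRun = 0;
    }
    return -1;
}

// Samples the middle of each of the 16 data bits that follow S3.
// Returns false when no start edge is found or the line is too short.
inline bool readCaptionBits(const std::vector<bool>& pixels, int pixelsPerBit,
                            std::uint8_t& first, std::uint8_t& second) {
    const int edge = findStartEdge(pixels, pixelsPerBit);
    if (edge < 0) return false;
    std::uint16_t bits = 0;
    for (int i = 0; i < 16; ++i) {
        // Skip S3 itself (one bit width), then move to the middle of data bit i.
        const std::size_t x =
            static_cast<std::size_t>(edge + (i + 1) * pixelsPerBit + pixelsPerBit / 2);
        if (x >= pixels.size()) return false;
        if (pixels[x]) bits = static_cast<std::uint16_t>(bits | (1u << i));
    }
    first = static_cast<std::uint8_t>(bits & 0xFF);
    second = static_cast<std::uint8_t>(bits >> 8);
    return true;
}
```

The same logic applies at any resolution; only pixelsPerBit changes, although with very few pixels per bit the edge position becomes too coarse to help much.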
Oops, now I realize that the LM1881 is not a video frame grabber, but just provides sync. So the A/D is not fast enough to capture more than 128 pixels per line. A faster A/D would allow 640 pixels or 720 pixels, or I suppose even 1920 pixels per line. With a faster A/D, your CC decoding would probably work better. Also, an FPGA or similar could be programmed like an async serial receiver that is specific to CC patterns, and then you could just load the 2 bytes per line directly into the CPU. You’d still need the ability to select between line 21 or the other ones (13, 14).
rsdio: You are correct that the limitation is speed. I’m not using ADC, I’m using the analog comparator which is faster than a true ADC. The real speed limitation is the clock speed of the MCU. If you look at the assembly code where I capture image data, you can see that it takes 5 clock cycles to do the work (store the analog comparator result in frame buffer, increment stuff) for each pixel. It actually takes 3 cycles and there’s a 2 cycle NOP, but the last pixel of the byte takes more time, so I’m bound by that.
At 16MHz, this means I can only capture 128 pixels across. In short, there isn’t enough time to capture a higher resolution. There isn’t enough memory either, but that’s a whole other constraint. See my article about the Seeeduino Mega for doing higher resolution overlay.
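The numbers in this exchange can be checked with a little arithmetic. The sketch below uses the 16MHz clock, 5 cycles per pixel and 128 pixels from the reply above; the NTSC line rate and the caption bit rate of 32 bits per line period are standard figures that are not stated in the article. The constant and function names are made up for this sketch.

```cpp
// Figures from the discussion above.
constexpr double kCpuHz = 16e6;          // Arduino clock speed
constexpr double kCyclesPerPixel = 5.0;  // cost of the capture loop per pixel
constexpr int kPixelsPerLine = 128;      // width of the capture buffer

// NTSC figures (not from the article).
constexpr double kLineHz = 4.5e6 / 286.0;         // about 15734 lines per second
constexpr double kCaptionBitHz = 32.0 * kLineHz;  // about 503.5 kHz

// Time spent on one pixel: 5 / 16 MHz = 312.5 ns.
constexpr double pixelSeconds() { return kCyclesPerPixel / kCpuHz; }

// Time needed for 128 pixels: 40 microseconds of a roughly 63.6 microsecond line.
constexpr double captureSeconds() { return kPixelsPerLine * pixelSeconds(); }

// One caption bit lasts about 1.99 microseconds, so it spans a little over 6 pixels.
constexpr double pixelsPerCaptionBit() { return (1.0 / kCaptionBitHz) / pixelSeconds(); }
```

A little over six pixels per bit agrees with the spacing of the values in the bpos array, which step by 6 or 7 pixels.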
Hi. I was wondering, is there any way of encoding Closed Captioning Data using an Arduino? Thanks.
The hard part of generating a closed caption signal for display on a TV is that you need to generate a sine wave with a particular frequency, and the timing needs to be just right.
Isn’t CEA-708 the relevant spec?
CEA-708 seems to be the spec for digital TV signals. For analog, it’s EIA-608.
http://en.wikipedia.org/wiki/EIA-608
Both 608 and 708 may be in use. Sometimes one is primary language and the other is secondary language.
And sometimes for broadcast they have to be tweaked to get them in the right place…
Hi Michael,
Was very happy to come across your page and this particular project.
I am very interested in this process of pulling the words out of the caption,
and would love to chat to see if we have the same ideas regarding this
technology.
Thanks for your time,
Bryan Amburgey
513.293.6788
Los Angeles, CA
Thanks for creating this! i’m using it for a kinetic sound installation using ‘f’ words.
so far the text is really jumbled, barely coherent, but i will persevere.
I’ll send you a video link when the project is done :)
For analog, it’s 608. And 608 holds CC1, CC2, CC3, and CC4. CC3 and CC4 are in field 2, while the other two are in field 1. I think Michael is decoding only field 1. I don’t know if he’s separating out CC1 from CC2, although it is rare for a show to mix these (but the Oprah Show used to do that before it went off-air).
Field 2 also has other potentially interesting things, like the name of the show, the name of the next show to air, station id, etc, in addition to CC3/CC4 (in what is called an XDS stream).
For digital, the standard is 708, which has an embedded 608 stream. Few programs really fully use the 708 standard, though. You usually get away with just decoding the 608 stream.
Hi Michael,
I’ve found out how PAL captions work. See http://www-user.tu-chemnitz.de/~heha/vt25/tele1.pdf; most PAL countries use the standard in the PDF. The main differences are: PAL uses line 21; PAL uses a clock run-in of 8 instead of 7; the start bits are 11100100 rather than 001. So what changes do we have to make to your lib to make it work with PAL?
Thanks
So.. can this be used as a regular CC decoder can? By that I mean to add subtitles to a film on laserdisc or something right on the TV? Without doing any modification to the code?
Yes, you can use TV.print(c) to write the characters to the screen instead of writing them to the Serial line. I assume your TV doesn’t decode closed captions (that would be rare).
WOW cool project.
Is it also possible to generate Closed Captioning with an arduino and decode it with a second unit?
Thx
Andy
No, it’s not possible to generate the required analog waveforms to create a closed captioning signal.
Hi Michael,
Great work! Very interesting project you’ve done.
I got the video experimenter shield and have been testing it.
Lately, I’m playing around with closed captions. I got it working and decoding, but I’m not able to decode latin characters like “ç” or “é”. Am I missing something? I tried reading the serial port using UTF-8 but no luck.
Maybe the output is single byte encoding and i’m trying to read multibyte chars, not sure what could be the problem. Could you help me on this?
Thanks
Vítor
Hi there,
I’ve just got a VE and it’s a great piece of kit.
Does anyone know how to get Closed Caption working with PAL (UK)? At the moment I’m just getting a solid white line at the top of the screen where the CC pixels should be.
Thanks,
Angus
Vitor
Change the baud rate on the serial monitor to 57600 and it will work.
Very nice!
Hi, I cannot synchronize the text, so I get Y # Y # Y $ by serial. Any tips?
Angus: I don’t know if you are still around here but… Most of the rest of the world outside North America uses Teletext (a.k.a. BBC Ceefax), which encodes far more bits into a line than EIA-608. It transmits 45 bytes of data (41 7-bit bytes after run-in and framing) in each of 17 lines from line 6 through 22. There’s probably no way to decode this at reduced resolution.
TVR: aside from CC1/CC2 on Field 1 and CC1(3)/CC2(4) on Field 2, there’s also T1/T2 on Field 1 and T1(3)/T2(4) on Field 2. So, 4 total data channels on Field 1 and 5 on Field 2.
Please do not comment to ask for help. See full product details or use the support forum to get help.
Has anyone been able to find CC data in a commercially produced DVD? I am using a DVD that has the “CC” logo on it, but I can’t seem to find the signal.
Yes, I have. You may need to try looking on different lines.
error ,class TVout’ has no member named ‘setDataCature’
plz halp
You need to install the TVout library properly.
Great! I really liked it. I am looking for a way to convert a string following the EIA-608 pattern to send that signal to an encoder. Have you seen anything about it? Congratulations on your project!
Hello, with this is there any way to determine when a TV show switches to commercials? It would be great to make some smart remote to skip commercials.
Hello,
I have been playing with the decoding, but cannot tell when the video signal changes between a TV show and a commercial. It would be great to distinguish between celebrity names on a broadcast show or a commercial. My wife watches shows that are about some celebrities that I also want to mute. Any way to add the ability to decipher the difference?
I don’t think you can distinguish between commercials and program. This has always been intentional, or it would be too easy for equipment to mute commercials or stop recording during commercials.
Can I use this module to get Closed Captions from a NTSC videotape? I am in Europe (PAL) but I have a VCR that can play back NTSC VHS in NTSC. I would like to display Closed Caption but I don’t have a NTSC TV.
Yes, you can, but without a display it would be hard to calibrate.
A Question. Just got VE. See only solid white line across top of TV screen. Comes and goes while turning POT. Tried line numbers 13, 14, 21 and some random dataCaptureStart values with no progress. No Serial monitor output after testing different baud rates. Passing U.S. Cable video stream through VE to Toshiba TV. Seems that Angus had a similar issue at question #17 above. Any ideas?
Added info: Trying to get Closed Caption text in serial monitor. Where’s the setup tutorial videos showing the complete process from start to finish? That’s what we need in addition to videos showing WHAT VE does.
Dee, please keep trying different lines. This project is not guaranteed to work with every video stream, and this is a very advanced project. Having said that, many people have gotten it to work.
Michael,
Thank you very much for your prompt response to my Closed Captioning Blank White Line issue. I’ll start with line = 1 and continue to increment by 1 until I get results. An idea: maybe an initial setup program to test varying line numbers in 5 second increments while displaying the line number being tested. Users would know exactly which line number worked without frequent code edits and uploads. I understand this may not be feasible given possible required fixed hardware states on initialization.
Again, thanks for your feedback.
Video Experimenter (Closed Captioning) Setup Details:
The overlayDemo works fine. I cycled through the code, varying line numbers 0-43, with no top monitor display beyond 43. The Closed Captioning serial monitor shows only gibberish text and never seems to be pausing/resuming or otherwise synchronized with the actual TV closed captioning text. For line 43, I see 4 or 5 unstable, flickering dashes across the screen top even during channels with no closed captioning. I turn the pot slowly counterclockwise until the top flickering dashes disappear. So, I figure that’s no good. I see two stable dashes using line 0. I’d love to get this working. Anyone with insights, please help.
Closed Captioning:
Actually starting pot fully ccw and slowly turning clockwise until top graphics disappear. Speed at 57600 baud. Which display, if any, is closest to focus on? Any new settings to try?
Thank You.
I guess Closed-Captioning support is inactive. (Dee 33-39)
Dee, I don’t know what else to tell you beyond what I did to make it work. The project is from 8 years ago, and is probably the most difficult, hackiest project here. It’s a miracle it worked for anyone, including me! I cannot guarantee it will work with every video signal in the world.
I wish to duplicate a successful use. Please list here, or provide a link to, your demo specifications. Please include your TV model and broadcast signal type (in the UK, I believe).
Thank You
Dee, this project was successful for me using a U.S. cable TV signal and a signal from a DVD player. TV model does not matter, as that is the output device. Remember, this project is from EIGHT YEARS AGO.
I’m assuming this cannot be used with Netflix or Hulu captions, as the signals are not being sent out in the same way?
Correct, because those are online streaming services.
Great information, and 8 years? Appreciate the commitment. Keep me in the loop on other TV output projects, thanks.
Excellent work!
Can anyone tell me if advertisements are ‘flagged’ in any way within the CC data?
I’d like to automatically mute the television when adverts come on.