Latin characters

This topic has 6 replies, 3 voices, and was last updated 12 years, 3 months ago by asivolella.

Viewing 7 posts - 1 through 7 (of 7 total)

Author

Posts
March 17, 2012 at 7:13 pm #459

vitormanfredini
Member

Hey Michael,

Lately, I’ve been playing around with closed captions. I got it working and decoding, but I’m not able to decode Latin characters like “ç” or “é”. Am I missing something? I tried reading the serial port as UTF-8, but no luck.
Maybe the output is a single-byte encoding and I’m trying to read multibyte characters; I’m not sure what the problem could be. Could you help me with this?

I also posted it as a comment on the video experimenter page but found out it wasn’t the place to do it.

Thanks
Vítor

March 18, 2012 at 1:24 pm #973

Michael
Keymaster

Have you looked at the closed captioning standard to understand the character set?

http://en.wikipedia.org/wiki/EIA-608
Compare with ASCII: http://www.asciitable.com/

The characters you mention are among the differences between the CC standard and ASCII. So, if you decode a character ‘{’ (code 0x7B), it is actually a ‘ç’ character (0x87 in extended ASCII, code page 437; plain 7-bit ASCII has no ‘ç’). So, you can just do a translation in your code, I guess.
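For anyone reading along, the translation could look roughly like the sketch below. It is only an illustration, not the decoder's actual code: `CC_BASIC_OVERRIDES` and `translate_basic` are made-up names, the table lists the basic EIA-608 codes that differ from ASCII as I understand the character set (worth checking against the standard), and it assumes the bytes passed in are printable caption characters with control-code pairs already filtered out.

```python
# Basic EIA-608 character codes that differ from 7-bit ASCII.
# Hypothetical table for illustration; verify against the standard.
CC_BASIC_OVERRIDES = {
    0x2A: "á",
    0x5C: "é",
    0x5E: "í",
    0x5F: "ó",
    0x60: "ú",
    0x7B: "ç",
    0x7C: "÷",
    0x7D: "Ñ",
    0x7E: "ñ",
    0x7F: "█",
}


def translate_basic(data: bytes) -> str:
    """Map raw caption bytes to text (assumes no control codes in data)."""
    out = []
    for b in data:
        code = b & 0x7F  # drop the parity bit, in case it is still set
        # Use the override when the code differs from ASCII,
        # otherwise fall back to the plain ASCII character.
        out.append(CC_BASIC_OVERRIDES.get(code, chr(code)))
    return "".join(out)
```

With this, `translate_basic(b"fa{a")` gives `"faça"`. Note that the result is a normal Python string, so it is the host program that decides how to encode it (UTF-8 or otherwise) when printing or saving.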

March 19, 2012 at 12:46 pm #902

vitormanfredini
Member

Michael,

You’re right!
I did a quick “translation” test and it worked.

Thank you for your help!
Vítor

March 19, 2012 at 1:31 pm #903

Michael
Keymaster

That’s great!

March 21, 2012 at 5:18 am #900

vitormanfredini
Member

Too soon… I’m running into another encoding problem that I don’t think is a “translation” issue.
I decoded, for example, the closed caption “VERDADE É UMA MARCA” as “VERDADE E UMA MARCA” (note the missing accent). Also, right after the missing accent I get a line break, which comes from the part of the code that checks the last control code and sends a line break to the serial port.
I still believe it’s a multibyte encoding issue, that is, the closed caption data needs two bytes to form the character and I’m losing the second one or something like that.
Do you think it’s possible?
Thank you very much.
Vítor

March 22, 2012 at 2:45 pm #896

vitormanfredini
Member

I found some useful information on “extended characters” in a book on Google Books:
http://books.google.com.br/books?id=2S2RncK5aEoC&lpg=RA1-PA366&ots=hBuMhk8XZ0&vq=character%20set&dq=EIA-608%20character%20set&hl=pt-BR&pg=RA1-PA366#v=onepage&q&f=false

If you read the first three paragraphs on this page you will understand how the “extended characters” work.
I will start modifying the decoder you wrote and I’d appreciate some help.
Thank you again.
Vítor
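As a rough illustration of the extended-character mechanism: in EIA-608 an extended character is sent as a two-byte code (first byte 0x12 or 0x13 on channel 1, second byte 0x20 to 0x3F), and it is preceded by a plain fallback character that the extended character is meant to replace. That would be consistent with seeing “E” followed by something the decoder treats as a control code. The sketch below is hypothetical, not the adapted decoder: `EXTENDED` and `decode_pairs` are invented names, the table is deliberately partial and should be verified against the standard, only channel 1 is handled, and basic characters are passed through as plain ASCII for brevity.

```python
# Partial, hypothetical table of extended characters, keyed by
# (first byte, second byte) with parity bits removed. Channel 1 only.
EXTENDED = {
    (0x12, 0x21): "É",
    (0x12, 0x32): "Ç",
    (0x13, 0x20): "Ã",
    (0x13, 0x21): "ã",
    (0x13, 0x27): "Õ",
    (0x13, 0x28): "õ",
}


def decode_pairs(pairs):
    """Decode (byte1, byte2) caption pairs into text."""
    text = []
    for first, second in pairs:
        first &= 0x7F   # drop parity bits
        second &= 0x7F
        if first in (0x12, 0x13) and 0x20 <= second <= 0x3F:
            # Extended character: it replaces the fallback character
            # that was sent just before it.
            if text:
                text.pop()
            text.append(EXTENDED.get((first, second), "?"))
        elif first >= 0x20:
            # Ordinary pair of basic characters (0x00 is padding).
            text.append(chr(first))
            if second >= 0x20:
                text.append(chr(second))
        # All other control codes are ignored in this sketch.
    return "".join(text)
```

For example, `decode_pairs([(0x45, 0x20), (0x45, 0x00), (0x12, 0x21), (0x20, 0x55)])` returns `"E É U"`: the second “E” is the fallback, and the pair 0x12 0x21 replaces it with “É”.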

May 7, 2013 at 3:02 am #1646

asivolella
Member

Hi Vitor and Michael

I’m also Brazilian, and I’m already able to retrieve CC, but I’m facing the same issues Vitor faced last year when he wrote these posts =)
So, Vitor, do you still have the adapted code you used to handle this problem at the time? If so, please share it with us!