Your Universal Remote Control Center
RemoteCentral.com
Philips Pronto Professional Forum - View Post
The following page was printed from RemoteCentral.com:

Topic:
to the pro's - could regular expressions parse this sort of data?
This thread has 10 replies. Displaying all posts.
Post 1 made on Wednesday December 22, 2010 at 18:04
mkleynhans
Long Time Member
Joined: August 2009
Posts: 54
Hi there,
I have been fiddling around with the Sky satellite box and have been getting
data to display on a TSU9800 screen via an RFX9600 serial port.

I am getting something like this every 60 seconds or when you change channel..
(As documented on dusky-control.com)

.232SSCN010270SSCA009FXSSDT026 2.06pm
Sat 12 Nov SST00132.00pmSSN0012.JAG.
SSE0157Admiral Chegwidden and ClaytonWebb
make an unlikely team whenthey join forces
to save a CIA agent from Italian terrorists.
Starring: Catherine Bell.b3

Broken down, it means;

Means       Type  Length  Data
==========  ====  ======  ====
Ch No.      SSCN  010     270
Ch Name     SSCA  009     FX
Time/Date   SSDT  026     2.06pm Sat 12 Nov
Start Time  SST0  013     2.00pm
Show Name   SSN0  012     JAG
Synopsis    SSE0  157     Admiral Chegwidden and Clayton Webb ... blah blah...

Do you pros reckon that some form of regular expression would be able
to clean this up? Every response varies in length, but the actual byte
length is specified after each code.
I really don't know where to start. I do have the developer's guide but haven't
got a copy of JavaScript: The Definitive Guide yet (ordered).

What do you think?

Cheers,

Mike
Post 2 made on Thursday December 23, 2010 at 03:37
Lyndel McGee
RC Moderator
Joined: August 2001
Posts: 13,006
As the lengths are spread throughout the data, you will need to use String.substring() to extract each field. I do not think RegExp will help you at all in this case.
Lyndel McGee
Philips Pronto Addict/Beta Tester
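
Editor's sketch of the substring() approach suggested above. It assumes the 3-digit length counts the whole field, including the 4-character code and the length digits themselves; that matches the sample in Post 1 (SSCN010270 is 10 characters, SSCA009FX is 9) but is not confirmed elsewhere in the thread. The function name parseFields is hypothetical.

```javascript
// Sketch: walk a run of length-prefixed fields using substring().
// Assumption: the 3-digit length covers the code, the length digits
// and the payload together (as the Post 1 sample suggests).
function parseFields(body) {
  var fields = {};
  var pos = 0;
  while (pos + 7 <= body.length) {
    var code = body.substring(pos, pos + 4);                  // e.g. "SSCN"
    var len = parseInt(body.substring(pos + 4, pos + 7), 10); // e.g. 10
    if (isNaN(len) || len < 7 || pos + len > body.length) {
      break; // malformed or incomplete field: stop rather than guess
    }
    fields[code] = body.substring(pos + 7, pos + len);        // payload only
    pos += len;
  }
  return fields;
}

var sample = "SSCN010270SSCA009FXSST00132.00pm";
var result = parseFields(sample);
// result.SSCN is "270", result.SSCA is "FX", result.SST0 is "2.00pm"
```

Because each field is located by its declared length, the payload text is never searched, so its content cannot confuse the parser.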
Post 3 made on Thursday December 23, 2010 at 03:39
MCFH
Long Time Member
Joined: December 2009
Posts: 35
This will be relatively straightforward in JavaScript but will need more than regular expressions: you will need to wait for data, search for the 0x0a byte, and then parse what follows according to the rules described on the site.

The other thing to watch for is it is not clear to me whether the 0x0a byte can only ever occur at the start of the message and not half way through.
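
Editor's sketch of the "search for the 0x0a byte" step. The function name syncToStart and the shape of the returned object are hypothetical, and the received data is assumed to arrive as an ordinary string.

```javascript
// Sketch: discard anything before the start byte and report whether a
// packet start has been seen yet.
function syncToStart(buffer) {
  var start = buffer.indexOf("\x0a");
  if (start === -1) {
    return { synced: false, rest: "" };  // no start byte yet: drop the noise
  }
  return { synced: true, rest: buffer.substring(start) };
}

var r = syncToStart("junk\x0a232SSCN010270");
// r.synced is true and r.rest begins with the 0x0a byte
```

This only holds if 0x0a never appears inside a packet, which is exactly the open question raised in this post.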
Post 4 made on Saturday January 1, 2011 at 06:04
blakeBF
Lurking Member
Joined: August 2009
Posts: 4
Hi Mike,

If you have full control over the start of the data packets (every 60 seconds or when changing channel), it should be possible to parse the data in an easy way.

As you can see in your separated list:

Ch No. SSCN 010 270
Ch Name SSCA 009 FX
Time/Date SSDT 026 2.06pm Sat 12 Nov
Start Time SST0 013 2.00pm
Show Name SSN0 012 JAG
Synopsis SSE0 157 Admiral Chegwidden and Clayton Webb ... blah blah...

every field starts with "SS". So you could use string.split() to separate the fields. In the next step you could use regular expressions to match the separated strings. If the number of fields is constant, you should additionally check the number of strings the split produced, to detect whether an "SS" is contained in one of the data fields. With this approach the order of the data fields doesn't matter. This solution only works if really every field starts with "SS".

regards
markus
Post 5 made on Saturday January 1, 2011 at 07:17
Paul Biron
Founding Member
Joined: August 2001
Posts: 142
If you use this approach, you would also need to add logic in case the Show Name or Synopsis contain a word with "SS" in it (such as "SESSION", for example).

Paul
___________________
Pronto Level II Certified
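
Editor's sketch of the split("SS") idea and the pitfall described above. The second string is a made-up Show Name field containing "SESSION"; it is illustrative data, not output captured from a Sky box.

```javascript
// Two well-formed fields: splitting on "SS" gives one piece per field,
// plus an empty leading piece because the string starts with "SS".
var clean = "SSCN010270SSCA009FX";
var a = clean.split("SS");   // ["", "CN010270", "CA009FX"]

// One field whose text contains "SS": the split cuts the payload in two,
// so the piece count no longer matches the field count.
var tricky = "SSN0014SESSION";
var b = tricky.split("SS");  // ["", "N0014SE", "ION"]
```

Both results have three pieces even though the second string holds a single field, which is why the piece count alone cannot be trusted without extra logic.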
Post 6 made on Saturday January 1, 2011 at 15:17
buzz
Super Member
Joined: May 2003
Posts: 4,384
I briefly looked at [Link: dusky-control.com].

In terms of Pronto programming the most difficult part will be dealing with a partial response. Every packet will begin with \x0a, but one cannot know exactly what the last character will be until all of the data has been received. At the time Pronto polls the port, the packet may not be complete.

A fundamental choice will be to start parsing a partial result or wait for complete results.

Everyone has their own programming style. I tend to use recursive calls to parse input strings. In my opinion, regular expressions will not be very useful here because there is no predictable end character or string -- unless one waits for the next \x0a. Unfortunately, one must wait until the next response begins before the \x0a will arrive. One must assume that \x0a is unique. Regular expressions are very useful when complex regular features can be exploited to separate variable length fields. In this case we know the field length in advance.

A bomb proof routine would discard characters, waiting for the first \x0a, log the packet length, optionally start parsing the individual elements, continuing till the character count has been received, then validate the checksum. Once the checksum has been validated, start reacting to the data or issue an error message.
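
Editor's sketch of such a routine. Several things here are assumptions rather than facts from this thread: that a packet is 0x0a, a 3-digit ASCII total length, the fields, and one trailing checksum byte; and that the total length counts all of those (the Post 1 sample, with 232 and field lengths summing to 227, is consistent with this). The checksum algorithm is not described here, so the check is passed in as a hypothetical callback. The name extractPacket is also hypothetical.

```javascript
// Sketch of a framing routine under the assumptions stated above.
function extractPacket(buffer, checksumOk) {
  var start = buffer.indexOf("\x0a");
  if (start === -1) {
    return { status: "waiting", rest: "" };        // discard noise
  }
  buffer = buffer.substring(start);
  if (buffer.length < 4) {
    return { status: "waiting", rest: buffer };    // length not here yet
  }
  var total = parseInt(buffer.substring(1, 4), 10);
  if (isNaN(total) || total < 5) {
    // Bad header: skip this start byte so the caller can re-sync.
    return { status: "error", rest: buffer.substring(1) };
  }
  if (buffer.length < total) {
    return { status: "waiting", rest: buffer };    // partial packet
  }
  var packet = buffer.substring(0, total);
  var rest = buffer.substring(total);
  if (!checksumOk(packet)) {
    return { status: "error", rest: rest };
  }
  // Strip the start byte, the length digits and the checksum byte.
  return { status: "ok", body: packet.substring(4, total - 1), rest: rest };
}

// Placeholder check that accepts everything; a real one would go here.
var acceptAll = function (packet) { return true; };

// The packet arrives in two polls, with stray characters in front.
var first = extractPacket("zz\x0a015SSCN0", acceptAll);
var second = extractPacket(first.rest + "10270X\x0a", acceptAll);
// first.status is "waiting"; second.status is "ok" and
// second.body is "SSCN010270"
```

Returning the unconsumed remainder lets the caller keep it and append the next poll's data, which is how the sketch copes with partial responses.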
OP | Post 7 made on Sunday January 2, 2011 at 05:48
mkleynhans
Long Time Member
Joined: August 2009
Posts: 54
Thanks chaps, it all sounds reasonable except I have absolutely no idea where to start. I have been through the developer's help but this is way beyond that.
I bought access to the definitive guide rough cuts and still can't figure my way through it.
I was hoping that I could do something like the below, but I'm battling.

1. Extract the entire length of data from the Sky box.

2. Check each individual piece and look for the six defining codes

53 53 43 4e - SSCN - Channel Number
53 53 43 41 - SSCA - Channel Name
53 53 44 54 - SSDT - Current Time & Date
53 53 54 30 - SST0 - Programme Start Time
53 53 4e 30 - SSN0 - Programme Name
53 53 45 30 - SSE0 - Programme Description

3. Once I have found the bits that I am looking for, specify the next three bytes of data as a value for the amount of data to organise.

30 31 30 - 010
30 30 39 - 009
30 32 36 - 026
30 31 33 - 013
30 31 32 - 012
31 35 37 - 157

4. Extract the individual sections of data from the main paragraph using the defining bits as starting bits and the byte length as the amount of data to be extracted from the start point. Assign the extracted data to five sections within my page.

5. Ideally if the data is not found, leave the existing data on the page.
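
Editor's sketch of steps 4 and 5: copying extracted values into named display slots while leaving a slot alone when its field was not found. The slot names and the plain display object are hypothetical stand-ins for whatever is on the page, and the parsed fields are assumed to already be in an object keyed by field code.

```javascript
// The hex bytes in step 2 are plain ASCII: 53 53 54 30 is "SST0",
// ending in the digit zero rather than the letter O.
var code = String.fromCharCode(0x53, 0x53, 0x54, 0x30);

// Field codes mapped to hypothetical display slot names.
var SLOTS = {
  SSCN: "channelNumber",
  SSCA: "channelName",
  SSDT: "clock",
  SST0: "startTime",
  SSN0: "programmeName",
  SSE0: "description"
};

// Copy parsed values into the display object; slots with no new value
// keep whatever they showed before (step 5).
function updateDisplay(display, fields) {
  for (var key in SLOTS) {
    if (SLOTS.hasOwnProperty(key) && fields.hasOwnProperty(key)) {
      display[SLOTS[key]] = fields[key];
    }
  }
  return display;
}

var display = { channelNumber: "101", programmeName: "Old show" };
updateDisplay(display, { SSCN: "270", SSCA: "FX" });
// display.channelNumber is now "270", display.channelName is "FX",
// and display.programmeName is still "Old show"
```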

Another problem: I read a guy called Dave's notes on his attempts to do this back in '09, and he found that the receive buffer is restricted to 512 bytes, which sometimes is not enough. Some programme synopses are longer than that.

I have an RFX9600 and TSU9600 in the office that I have been trying with. I will need to get this home to work on every night as it's taking me ages and I have too much 'real work' to do =(

Still, it would be absolutely awesome to have that sort of info showing up on a page when watching Sky and changing channels. I'm pretty surprised no others have given this a good bash, as all of us UK users must be using Sky (well, most of us anyway).

Thanks again for the help, I am trying to use your tips with the definitive guide to see if I can put anything together.

Cheers,

Mike
Post 8 made on Sunday January 2, 2011 at 07:46
johnmack
Long Time Member
Joined: October 2005
Posts: 42
Mike

PM me your email address and I'll let you have my solution. I can't guarantee it as it stopped working when I changed HD boxes and I've never worked out why.

John
Post 9 made on Sunday January 2, 2011 at 10:29
buzz
Super Member
Joined: May 2003
Posts: 4,384
mkleynhans,

Rather than scan for the field type (SSCN, etc.), I think that you should parse the packet using the field lengths -- which you know in advance. Once you have a complete field isolated, processing it is trivial. Scanning for field type is messy and will always fail, because you will not know where the last field ends until the next field, possibly in the next packet, arrives (if it ever arrives). Further, there are a variable number of characters (\x0a, length, checksum, possibly stray data) between fields.

Check out the Array methods of ProntoScript (JavaScript). A relatively straightforward method would be to periodically check for and push() new data on an array and shift() data from the beginning as you process. Since you will strip out fields as they are processed, typically, the first array element(s) will be the \x0a, a packet length, field type, or the checksum. Any other data conditions warrant an error message and a re-sync effort (scan for the next \x0a).
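
Editor's sketch of the push()/shift() idea, using an array as a character queue. The helper names enqueue, take and resync are hypothetical, and a 3-digit length following the \x0a is assumed for the demonstration data.

```javascript
// push() appends newly received characters to the end of the queue;
// shift() removes them from the front as they are processed.
var queue = [];

function enqueue(chunk) {
  for (var i = 0; i < chunk.length; i++) {
    queue.push(chunk.charAt(i));
  }
}

// Remove and return the next n characters, or null if fewer are queued.
function take(n) {
  if (queue.length < n) {
    return null;
  }
  var out = "";
  for (var i = 0; i < n; i++) {
    out += queue.shift();
  }
  return out;
}

// Re-sync: drop characters until the queue is empty or starts with 0x0a.
function resync() {
  while (queue.length > 0 && queue[0] !== "\x0a") {
    queue.shift();
  }
}

enqueue("??\x0a015SS");
resync();
var startByte = take(1);   // "\x0a"
var lengthText = take(3);  // "015"
var tooSoon = take(10);    // null: only "SS" is queued so far
```

Returning null instead of a short string keeps a partially received field in the queue until the next poll delivers the rest.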

The RFX9600 makes this sort of interaction more difficult than necessary because of the 512 byte receive buffer limit and because sending will flush the receive buffer. It is best to check the receive buffer immediately prior to sending and keep your fingers crossed that no receive data is lost while the send is in progress. The checksum will help you to determine if some receive data was lost. If the last packet was complete (the checksum was received and validated) and the next received character is the \x0a, probably no data was lost.

If the Pronto goes to sleep, assume data was lost.

A time proven way to approach this sort of task is to flowchart the process. If you cannot flowchart the task, you'll never be able to write a functional program. Start with large blocks and don't worry how to fill the blocks with program lines. For example the first cut might be:

Receive Data
Parse into fields
Process fields
Display

Next, pick one of the big blocks and break it down into smaller blocks. If you do not immediately know how to accomplish one of the smaller blocks, give the sticking part a name, move on, and revisit that block later -- breaking it down into ever smaller blocks. Eventually, those initially intimidating blocks will be broken down into ever smaller, easily implemented chunks. Make sure that your flow chart considers abnormal conditions, such as lost or garbled data, Pronto sleep, or a reboot.
OP | Post 10 made on Sunday January 2, 2011 at 12:09
mkleynhans
Long Time Member
Joined: August 2009
Posts: 54
Buzz and John, thanks guys. The rough cuts are hard going for a newbie like me, and the latest version of the definitive guide won't be shipped till Feb, so I am going to order the currently released version, which should make things a bit easier.

Will give this a good going over next week!!

Cheers,

Mike
Post 11 made on Tuesday January 4, 2011 at 15:00
Lyndel McGee
RC Moderator
Joined: August 2001
Posts: 13,006
The control panel uses JavaScript 1.6 (SpiderMonkey C engine), so you should be OK without having the newest version of the Flanagan book.

Note that you can google for things like

+javascript +string +substring +function

and get hits for sites such as w3schools or the Mozilla developer site, which is where the JavaScript reference is located.

Also googling for ECMA 262 and ECMA 357 will provide PDF references for the language and E4X XML Extensions. Very technical but useful if you want nitty-gritty details.

Also, IIRC, the buffer size in RFX9600 is 1024 and not 512 bytes.
Lyndel McGee
Philips Pronto Addict/Beta Tester

