Sunday, October 19, 2014

Life As A Cordcutter

A few years ago, tired of paying a ton of money for satellite TV channels that I never watched, I "cut the cord" - canceled my pay TV subscription and began getting my TV fix through alternate means.  I had high-speed internet and no intention of giving that up, so it made sense to stream as much as possible over that connection.  Eventually I also installed a good old outdoor aerial antenna to pick up HD broadcasts from the closest major city.

Over the years I have used several "set top" solutions for my streaming, subscribed to several of the major streaming video providers, and played with a lot of additional software solutions for finding content.  I'm not going to describe everything there is to know about cutting the cord (not like I know it all anyway) as there are several good web sites that do that.  I'm going to tell you what works for me, what I like and don't like, and offer some tips should you choose to try this for yourself.

There are a couple of caveats I need to list right up front.  First, if you are a real sports fanatic, you probably won't want to get rid of your cable or dish subscription just yet.  Legal avenues for streaming sports - especially live sports - are mostly nonexistent, and the ones that do exist are often tied to having a cable or dish subscription.  There are some exceptions, being one of the standouts.  There are also ways to watch live sports streams which are of questionable legality, and which frequently suffer from poor quality.

Second, your internet connection needs to be stable and relatively fast.  5 megabit DSL is about the lowest you want to go for this, 10 - 15 is better, and to take advantage of Netflix's ultra HD streaming (4K HD) you will need 20 - 25 megabits per second of bandwidth.
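Those rough thresholds can be sketched as a small lookup.  This is just an illustration in Python: the function name and the tier labels are made up here, and the cutoffs are the ballpark numbers from the paragraph above, not anything published by a streaming provider.

```python
def streaming_tier(mbps: float) -> str:
    """Map a measured download speed (Mbit/s) to a rough streaming tier.

    Cutoffs are the ballpark figures from the post; labels are hypothetical.
    """
    if mbps < 5:
        return "too slow"        # below the suggested 5 Mbit/s floor
    if mbps < 10:
        return "minimum"         # workable, but with little headroom
    if mbps < 20:
        return "comfortable"     # the "10 - 15 is better" range and up
    return "4K-capable"          # 20 - 25 Mbit/s suggested for ultra HD


if __name__ == "__main__":
    for speed in (3, 5, 12, 25):
        print(speed, "Mbit/s ->", streaming_tier(speed))
```

Run a speed test a few times at the hours you normally watch TV and feed in the lowest number you see, since a connection that only hits its rated speed at 3:00AM isn't much help in the evening.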

Equipment - I use the Roku 3.  This little box retails for $99, although you can find deals on it pretty much everywhere.  You might want to look for it on Woot where refurbished models are frequently offered for $65.  The Roku 3 features up to 1080p video streaming, both wired and wireless networking, and a nifty remote control with a built-in headphone jack (so you or your partner can watch TV while the other sleeps in peace).

The Roku has channels for just about every major video service available - Netflix, Amazon Prime, Hulu Plus, Vudu, and many others.  It's about as simple to use as any such device can be.

The biggest drawback of the Roku is that it isn't a full-fledged computing device, so it can't readily browse content over the local network, and one very helpful piece of software - XBMC - won't run on it.  All isn't lost, though - Plex comes to the rescue.  Plex is a media server you run on your PC, cataloging all your local content (movies, TV shows, music, etc.) and downloading metadata.  You install a Plex client from the Roku channel store and use it to pull content from the PC.  Plex also has its own "channels" you can install, and these fill in gaps that exist both on the Roku and the Hulu Plus service.  For example, the Roku has no YouTube channel of its own, but Plex provides one.  And while CBS is the only major broadcaster not present on Hulu Plus, there is a Plex CBS channel that allows you to stream most of the current shows.

While there are many other equipment choices available, if I were looking to buy something new today, I would consider the Amazon Fire TV - it's about the same price as the Roku, can stream all the same major sources, and there is a build of XBMC that will run on it.

The only other piece of equipment I'd mention is that I have an antenna outside that picks up live HD broadcasts of the major networks.  The picture is crystal clear (better than I ever had with my small dish).

Content sources - I have subscriptions to Netflix, Amazon Prime, and Hulu Plus.  Each has its strengths and weaknesses.  Netflix has for some time been the reigning king of the streaming world, with a vast library of movies and TV shows.  The content is basically what would be available on DVD or Blu-Ray - in other words, nearly always slightly older movies and previous seasons of television shows.  They are offering some exclusive content these days as well.  I have had Netflix on and off a few times over the years.  I'm giving it another try right now, but mostly I am finding that nearly everything they have that I care to watch is available on Amazon Prime.

Amazon Prime is a movie and TV service similar to Netflix.  For a long time it was a pretty distant second in terms of the size of the library, but Amazon is catching up - fast.  More importantly, a Prime subscription gets you free 2nd-day shipping on much of what Amazon has for sale, access to their Kindle Lending Library with thousands of books, and now includes their music streaming service as well.  It's an amazing deal, and if you had to pick between Amazon and Netflix, Amazon would be a worthy choice (unless you just HAVE to watch the stuff that's exclusive to Netflix).

Hulu Plus has movies, but not many, and they're usually older.  What makes Hulu special is that it is the only service with current-season episodes of many of your favorite shows.  Hulu is a joint venture of the parent companies of NBC, FOX, and ABC, and it also carries shows from the CW.  Shows are usually available one day after the broadcasts air.  The big hole in Hulu's lineup is CBS, and it doesn't look like this is going to change: CBS has announced their intention to offer a separate pay service.  As of now, you can watch CBS shows using the Plex channel.

XBMC - XBMC is a media center application that you can run on your PC or Mac, and on some set-top boxes as well.  XBMC catalogs local content (similar to Plex, which is actually built on some of XBMC's code), and offers many other sources of content.  XBMC is open-source software, and due to the number and type of add-on channels and plugins available, can be a bit daunting for people who aren't tech-savvy.  But it's an amazing resource if you take the time to get it set up.

Monday, October 13, 2014

What A Week

Last week I was "on-call" for work.  That meant I was responsible for watching our monitoring systems and problem queue, working problems as they arose if possible, and coordinating efforts if it was something I needed help resolving.  The first couple of days were pretty slow - a couple of failed power supplies in systems with redundant power, no biggie.

Thursday I got a call that users in our Mechanicsburg office were experiencing a lot of performance degradation.  A quick check of their primary MPLS circuit (from Level3) showed a lot of packet loss.  We have BGP configured to switch them over to CenturyLink if Level3 fails, but the circuit hadn't actually dropped, so we forced it - shut BGP to Level3, and opened a problem ticket with them.

A short time later, our monitoring tools reported trouble reaching a router in Williamsport - another Level3 circuit, this time the backup circuit, normally only used when connecting to that one router.  We began thinking Level3 was having a bigger issue.  But before we could contact them to add the info to our ticket, we heard users in Harrisburg were having performance issues.  Level3 again, and the primary circuit - so we shut BGP there, forcing them over to a backup circuit from Verizon.  Finally we got the Level3 ticket updated with all the circuit information and waited for their response.

About 3:00PM a bunch of us were supposed to go out to celebrate a teammate's birthday.  Right when I got to the bar, the phone rang - a network admin was requested to look at an application issue.  So I went back in and launched into one of those 3-hour marathon sniffer sessions.  Fun!  I finally got out about 6:00PM and headed home.

On the way home I got a text message from Bank of America - fraudulent charge suspected on my debit card, please call or log in to online banking to check.  Peachy.  As I walked into my house around 7:00PM, my cellphone rang - a guy at work who was going to swap some potentially bad GBICs on a fiber link wanted me to make sure we had traffic off the link.

I decided to call back from my landline because cell coverage at home is spotty.  I picked up the phone - no dial tone.  Luckily I still had DSL service.  I got logged in, called him from my cell, and got that one worked.

In the meantime I opened a chat session with the phone company's tech support.  They wanted me to swap phones or try the test jack outside the house.  No good - I didn't have a spare phone, and the one I did have was a cordless that requires power for the base station.  I would have to wait until I could get another phone on Friday to find out if it was my problem or the phone company.

Finally I logged into BoA's web site.  Yep, somebody tried to access my account from a Publix supermarket down in Florida.  Of course as soon as I marked the charge fraudulent, BoA promptly canceled my debit card and notified me it would be 5 - 7 days to get a new one.  You just have to love the modern world, right?  I checked my wallet - $5 cash, maybe with that and the change I keep in the jar at work I would be able to eat on Friday.

Friday morning, we had an email from Level3 waiting for us.  They had found a problem with a core router serving a bunch of their customers in the northeast, and routed around it.  After talking it over with my director and teammates, we decided to keep Mechanicsburg and Harrisburg on their backup circuits for the day and watch the Level3 circuits.  If everything held up we would re-enable BGP over Level3 sometime Friday night.

Two hours later the Verizon circuit to Harrisburg died.  Just plain died.  And with BGP shut over the Level3 circuit, they were cut off completely.  We dialed into a modem on an emergency backup router and got BGP going again on Level3 to get them back online.  Total time of that outage was maybe 5 minutes.
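The trap we walked into can be shown with a toy model.  This is a Python sketch and emphatically not how BGP actually selects paths: the Circuit fields, the preference numbers, and the function name are all made up for illustration.  The point is only that a circuit with BGP administratively shut is no longer a fallback, so a site "failed over" by hand is running with no backup at all.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Circuit:
    name: str
    preference: int      # higher wins (hypothetical stand-in for routing policy)
    bgp_enabled: bool    # False means we shut the BGP session by hand
    link_up: bool        # False means the carrier circuit is down


def active_circuit(circuits: List[Circuit]) -> Optional[str]:
    """Return the name of the circuit carrying traffic, or None if the site is cut off."""
    usable = [c for c in circuits if c.bgp_enabled and c.link_up]
    if not usable:
        return None
    return max(usable, key=lambda c: c.preference).name


if __name__ == "__main__":
    level3 = Circuit("Level3", preference=200, bgp_enabled=False, link_up=True)
    verizon = Circuit("Verizon", preference=100, bgp_enabled=True, link_up=True)
    harrisburg = [level3, verizon]

    print(active_circuit(harrisburg))   # running on the backup circuit
    verizon.link_up = False
    print(active_circuit(harrisburg))   # backup died, primary is shut: cut off
    level3.bgp_enabled = True
    print(active_circuit(harrisburg))   # re-enabling BGP brings the site back
```

If we had been able to leave the Level3 session up but less preferred instead of shutting it outright, the Verizon failure would have failed back on its own.  With real packet loss on the primary that isn't always an option, which is why we had the dial-up modem.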

Friday afternoon rolled around and I got talked into trying another social outing.  But just when it was time to leave, I got asked to look at another issue - a file transfer running over a point-to-point circuit between Florida and Pennsylvania was running slow.  In fact, it had been running slow all week, but no one had asked for help until Friday afternoon.  AAAUUUGGGHHH!  So another night not getting off until 6:00PM, not getting home until 7:00PM.  And to make it more interesting, it looked like there was packet loss going from us to the remote site - on a Level3 circuit.  Not MPLS, true, but another Level3 circuit in Pennsylvania?  They claimed to have routed around their other issue, but at this point we were getting gun-shy about putting anything else on their network if we didn't have to (Harrisburg notwithstanding).

On the way home I stopped at Target and bought a plain-old telephone that doesn't need external power.  When I got home I plugged it in inside the house - no dial-tone.  I took it out to the box outside - no dial-tone.  Ok, it's the phone company's problem.  I went in to do another online chat session with tech support, but now I had no DSL.

I got on the cell phone to call the phone company and halfway through one of the half-dozen prerecorded messages, the call dropped.  I dialed back, worked my way through the menus - and got dropped listening to the same message.  Now, they say that doing the same thing over and over and expecting a different result is one definition of insanity.  I must be insane, because I tried a third time.  And got dropped during the same message.  Finally I called in and just kept hitting "0" on every menu and eventually got a live person.  Of course, all they could tell me was they didn't see any trouble in my area, couldn't call my house phone (duh) and couldn't see any signal from my computer.  That, and they couldn't send anyone to the house to fix it during the weekend unless I paid.  Otherwise I would have to wait until Monday for a visit from a tech (I was still on-call for the weekend), and I would have to stay home from work to meet the tech or they wouldn't come (despite the fact that the issue was clearly NOT inside my house).

So today is Monday.  The tech came.  They had moved my circuits last week to a new switch and somehow failed to configure my service.  

The good news is, I'm not on-call again for about 7 weeks.  


Bloody Turnips

“You can’t squeeze blood from a turnip.”  This old saying is a way of expressing that some things are so obviously impossible that they aren’t worth trying, that they are a waste of time.  But sometimes the problem isn’t that we’re trying to squeeze blood from a turnip - the problem is assuming that we’re looking at a turnip in the first place.

The other day I got “the call.”  “The call” usually comes late in the day, and frequently on a Friday.  It’s when someone has been working at a problem all day, or all week, realizes they are running out of time, and in a last ditch effort at a resolution they ask for a network admin to take a packet trace.  And I’m the person that frequently gets “the call.”

This time it was an application which picks up files from a server.  The application was locking up, and the people troubleshooting it explained that this is frequently a sign of a delay in picking up the files (this application was said to be super time-sensitive).  Server admins had found nothing wrong on the file server.  I was asked to see if there was anything causing network-based latency, or if I could at least see something in the trace that might account for the issue.

I have to admit that I did not approach this problem with any enthusiasm.  I have a life.  I do not like getting called at 3:00PM to start a multi-hour troubleshooting session on something this vague.  But it’s part of the job, these were my customers, and apparently nobody else was making any headway (including the vendor of the application, who had been called in to work on it).

Now despite being pretty good with the sniffer - and sometimes enjoying the challenge - I know that it can be a hard way to get to the root of a problem, so I made an effort to do things the easier way.  I asked the usual questions - when did the problem start, did something change, could I get a more technically accurate description of the problem, etc.  I looked at the basics - located and checked for errors on the switch ports of the file server and application system and so forth.  And then, reluctantly, I fired up the sniffer and got started.

About an hour into the session, one of my teammates came up to watch, and he asked the obvious question - “Do you really think you’re going to find the problem by looking at the packet contents?”  He was, in essence, asking me if I was trying to squeeze blood from a turnip.  And honestly I did not know how to answer him.

It’s something I’ve thought about often over the years.  I am very interested in troubleshooting - the thought processes that go into it, the practice of it, the techniques that are used.  I think that the act of trying to reverse-engineer an application by staring at the sniffer until it feels like my head is bleeding is a really hard way to do things.  But while I have not come up with a lot of amazing answers to those questions, I have learned one thing:

I can’t solve a problem if I don’t try.

There are a lot of times it feels like I’m squeezing a turnip.  But the truth is I don’t know what I’m squeezing.  It’s like sticking my hand in a bag and grabbing something, and squeezing it, and after a long time I get some blood out of it - in which case I find that it wasn’t a turnip.  And sometimes I get nothing but turnip guts.

So I just said to him - “I have no idea.”  And I kept on squeezing.

I’d like to conclude this post by telling you about the amazing discovery I made in the packet trace.  Unfortunately that didn’t happen.  What did happen is I was able to determine that when the application freezes up, it isn’t waiting for anything from the file server.  The application was getting a response that looked “complete” (for you packet monkeys, it had the PUSH flag set on the last packet of the response), the application system responded with an immediate ACK, and then sat there for a long time before doing anything else.  Then the application system sent a packet and things started up again.  I saw this happen multiple times during “freezing” episodes.
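For the packet monkeys, here is roughly what that check looks like if you do it with a script instead of your eyeballs.  This is a Python sketch under some loud assumptions: the Packet record, the "app" and "server" labels, and the one-second threshold are all made up for illustration, and it assumes you have already exported one conversation from the sniffer into a simple list.  It is not the tool I used.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Packet:
    time: float    # seconds since the start of the capture
    src: str       # "app" or "server" (hypothetical labels for the two hosts)
    psh: bool      # True if the TCP PUSH flag is set
    payload: int   # bytes of TCP payload; 0 for a bare ACK


def client_side_stalls(packets: List[Packet], threshold: float = 1.0) -> List[Tuple[float, float]]:
    """Find gaps where the server's response looked complete (last data packet
    had PUSH set) and the app then waited more than `threshold` seconds before
    sending any data of its own.  Returns (stall_start, stall_length) pairs."""
    stalls = []
    response_done_at = None
    for pkt in packets:
        if pkt.src == "server" and pkt.payload > 0:
            # Remember when a response ended; data without PUSH means more is coming.
            response_done_at = pkt.time if pkt.psh else None
        elif pkt.src == "app" and pkt.payload > 0:
            if response_done_at is not None:
                gap = pkt.time - response_done_at
                if gap > threshold:
                    stalls.append((response_done_at, gap))
            response_done_at = None
        # Bare ACKs from the app (payload == 0) do not end the wait.
    return stalls


if __name__ == "__main__":
    trace = [
        Packet(0.00, "app", True, 120),      # request
        Packet(0.01, "server", False, 1460), # response data
        Packet(0.02, "server", True, 800),   # last packet of response, PUSH set
        Packet(0.02, "app", False, 0),       # immediate ACK
        Packet(4.52, "app", True, 120),      # app finally does something
        Packet(4.53, "server", True, 500),
        Packet(4.53, "app", False, 0),
        Packet(4.60, "app", True, 120),      # normal turnaround
    ]
    for start, length in client_side_stalls(trace):
        print(f"app sat idle for about {length:.1f}s after response at t={start:.2f}")
```

On the sample trace this reports a single stall of roughly four and a half seconds, and the server is not on the hook for any of it: the response was finished and acknowledged before the wait began.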

What does it mean?  Well, it means the problem isn’t a delay in getting information from the file server.  There could be a problem in the contents of the response, and being unfamiliar with the application itself I couldn’t speak to that.  Or there could be something happening on the application system causing it to freeze that has nothing to do with the network traffic.

This information didn’t solve the problem for the application folks.  It did get the file server admins off the hook, and it pretty well proved the network infrastructure wasn’t at issue, and it gave the application admins and their vendor a little push in the direction of looking at their own system a little harder.  I hope it helped.

If there is a message here, it’s this - troubleshooting can be a painful, frustrating, and sometimes ultimately unrewarding process.  Problems can be really complicated, the tools can be hard to use, and the whole thing can just be a lot of work.  Even when you try your best you don’t always come up with a big win.  But if you don’t try, you don’t stand a chance.  I think a lot of people - including a lot of network people - think that problems can't be solved with a sniffer, or maybe that they can't solve them, so they don't try.  All I can say is, I've done it often enough to know it's not impossible.  Working a problem with a sniffer isn't always fruitless.  So the moral of the story?

Keep squeezing.