Thursday, February 26, 2015

James's Rules of Troubleshooting

While digging through some old documents this afternoon, I came across something I had written about 5 or 6 years ago.  It's a pretty fair summarization of some things I have learned about troubleshooting.  I've made a few minor edits, but for the most part it's just as I wrote it back then.


James’s Rules of Troubleshooting

Before - (things to do/know prior to a problem – when something breaks it’s too late to start working on these):

  • Know your stuff.  Be an expert on the technology for which you’re responsible.  “SME” means Subject Matter Expert – be one.  It’s MUCH easier to spot what’s wrong if you know what “right” looks like.
  • Know where to find your product manuals, configurations, and logs.  Know what’s in the logs. Know how to read them.  Make sure logs are tuned to show the right amount of data, kept for a sufficient period of time to be useful, and are time-synchronized with everything else in the network.  Keep frequent backups of configurations / changes.
  • Have the tools you need.  Have them installed.  Have them up-to-date.  Know how to use them. Try not to get tied to a single tool, no matter how good it is (when all you have is a hammer, everything looks like a nail).  Know more than one way to skin the cat.
  • Work on your people skills.  You got into I.T. to avoid dealing with people?  Hopefully I’m not the first person explaining to you that this is not possible.  The network, computers, and software exist to serve people (commonly known as “users”) and you will need to be able to deal with them.  This includes people on other teams and from other technical disciplines. Network people need to be able to talk to server people, Unix people need to be able to talk to Windows people, etc.  Don’t let personal issues fester – they’ll get in the way at the worst times.
  • Work on your communication skills.  Be able to speak and write clearly.  Clarity and accuracy are very important.  To the degree that it is possible, clarity is achieved by being as simple as you can be while retaining accuracy.  Get comfortable standing in front of small (5 to 10 people) and medium sized (10 to 25 people) groups and explaining how your technology works, how it ties in with the rest of the system, etc.  Be good at drawing diagrams.  Have a diagram of what your stuff looks like before you ever need it.

During (what to do while working a problem):

  • Get a clear description of the problem.  This is often harder than it sounds.  “Users”  can talk in vague terms – “Everything is slow.” “The network is broken.”  You’ll need to elicit the right information through direct questions.  “Exactly what were you doing when the problem occurred?  Were there any error messages displayed on screen?  Were other applications affected?”  You may have to repeat questions multiple times in order to get the user to answer what you’re asking.
  • Oddly enough, it can be even harder to get a clear story from a technician – they may be giving an edited version of events based on their own bias (they think their stuff can’t be broken, or that they know where the problem lies).  They will also likely want to tell you all the steps that were tried before you were called.  Unless they kept very good records of what was done, in what order, and the results of every test, that information is likely to be less than helpful.  If you don’t feel you can completely rely on the source, no matter how good your relationship, it is best to do your own investigation – “See for yourself!”
  • If this is a problem with something that has been previously working and is now broken, find out what changed (if anything).  Software updated (on server or workstations or network infrastructure)?  Hardware changed?  Don’t be too quick to dismiss something that you don’t think is related – the installation of new DNS servers really can be the cause of slow network performance logging into Unix boxes.  Use your company’s change control record-keeping system to research.
  • Check your stuff first.  When you are asked to join in a troubleshooting effort, make sure your components are not misconfigured or broken.  When someone asks you to take a look, your response should not be “My part can't be broken…” – rather, it should be “I’ll go check that out and get right back to you.”  If you are prepared (see the “Before” section) this should take very little effort.
  • If the problem turns out to be your area, you need to fix it, but you also need to report honestly and accurately to the team or leadership.  I won’t tell you which one to do first – that depends on what’s broken, the rules at your company, etc.  But you should make that report a priority. Honesty can be difficult when the problem reveals a personal error.  All I can do is urge you to suck it up and do the right thing – it’s worked remarkably well for me over a long period of time.  People understand that if you’re not making the occasional mistake, you probably aren’t working.  They will respect honesty (as long as you’re learning from your mistakes and not making the same ones over and over).
  • If the problem is not in your stuff, offer to help other areas however you can.  If you got called in because you have a reputation as a great troubleshooter, this will already be understood.  But even if that’s not the case, you may have valuable insight to offer – or maybe just a fresh set of eyes (and brains). If you can’t help directly, you can still learn a lot by watching the process and observing the resolution.
  • Try to recreate the problem.  If you can’t test things in the production environment, try to set up a test-bed.
  • Compare working and non-working configurations.  Got two servers that are supposed to be doing the same thing, but one isn’t working?  Find out how they are different!
  • Be persistent.  Don’t give up.  Stick with the job.  
  • Know when to get help.  Some I.T. people are loath to open problem records with a vendor (including yours truly).  But if a production system is down and the bottom line is hurting, bring the vendor in earlier rather than later.  There may be a known issue that they can recognize quickly.
  • Take breaks.  During a protracted issue, you’ll want to rest your brain on occasion.  Failing to do so can cause you to overlook otherwise obvious problems – when you look at something for too long, it starts to look normal even if it’s broken!  Get up and walk.  Drink water.  Don’t forget to eat.

After – because eventually, the problem will be solved…

  • FIX the problem.  If workarounds were applied, remove them.  Patch software.  Reconfigure equipment.  Whatever was broken, make it whole again.  “Bandaids” that are applied to get through an initial rough spot should not be considered a complete fix.
  • Bring the system back to “standard”.  Don’t settle for a one-off solution that no one will remember exists in a week.  If there is something wrong with the standard, FIX THE STANDARD.
  • Document what you did.  The standard documentation for the technology in question should be updated to reflect new configurations, new software versions, etc.  Diagrams should be updated.  The problem isn’t fixed until this is done.
  • Be able to express in clear language to your superiors and peers what happened, the steps taken to fix it, etc.  Make sure you understand what happened, in technically accurate terms.
  • Review your performance (and that of your peers) during the troubleshooting process.  Did you discover systems that aren’t time synchronized?  Fix it! Did you discover you really DON’T know how to use that nifty tool?  Practice!  Did you take too long to come to the right conclusion, perhaps overlooking data that was obvious in hindsight?  Review the process you followed and try to understand how you might have gotten to the correct solution more quickly. The after-action "lessons learned" session is a valuable opportunity, not to be wasted.

Monday, February 23, 2015

Sloppy Programming Leads To Confusion

Recently I posted on Google+ about a deal on a Brother laser printer (the HL-L2380DW), and then wrote a blog post about it.  A Google+ user who bought the same printer ran into a problem accessing the web interface of the printer after he got it on the network.  Here's why:

When you set up the Brother on your network for the first time, you'll probably let it get assigned an IP address automatically from your home router using dynamic host configuration protocol, or DHCP.  That means you won't necessarily be able to predict what IP address has been assigned.  You might be able to find it by looking at your router if it has a way of displaying what IP addresses it's given out, but most people will simply use the LCD touchscreen on the printer to browse to the information, and therein lies the problem.

Brother displays IP addresses with leading zeros padding the octets of the IP address.  For example, my printer at home has IP address 192.168.0.22.  But on the display, the printer shows it as 192.168.000.022.  If you weren't familiar with the normal notation of an IP address and tried to enter this into a web browser, you might get some odd results.

I have three browsers on my desktop system - Safari, Firefox, and Chrome.  Interestingly, Safari was able to bring up the printer's web interface with no apparent difficulty, while leaving the display on the URL bar showing 192.168.000.022.  Firefox also left the display alone, but was not able to bring up the page.  Chrome interpreted the last octet as being written in octal (base 8) due to the leading zero, and converted it to decimal 18 (presumably it also treated the third octet as octal, but since 000 in octal and 0 in decimal are the same thing, it doesn't really matter).  Chrome changed the URL display to read 192.168.0.18, and failed to contact the printer.

A user commenting on the Google+ post noted that this behavior (interpreting an octet in an IP address as octal due to a leading zero) is expected, if perhaps unhelpful.
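To make the octal interpretation concrete, here is a minimal Python sketch of the two ways the printer's display can be read.  The function names are my own, and the "legacy" parser is only an approximation of the classic C-style rule (a leading zero means octal, a leading 0x means hexadecimal), not the actual code inside any browser.

```python
def parse_legacy_octet(text):
    """Parse one dotted-quad field with C-style base auto-detection."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered, 16)        # hexadecimal field
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)            # leading zero means octal
    return int(text, 10)               # plain decimal


def parse_legacy(address):
    """Read every field the way a leading-zero-is-octal parser would."""
    return ".".join(str(parse_legacy_octet(field)) for field in address.split("."))


def parse_padded_decimal(address):
    """Read every field as zero-padded decimal, which is what the printer means."""
    return ".".join(str(int(field, 10)) for field in address.split("."))


printer_display = "192.168.000.022"
print(parse_legacy(printer_display))          # 192.168.0.18
print(parse_padded_decimal(printer_display))  # 192.168.0.22
```

The first result matches what Chrome did with the address, and the second is the address the printer actually has.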

It would be nice if Brother could correct the display on the LCD to prevent confusion like this from occurring.

A Quick Note

I have signed up for AdSense so there will be ads appearing on the blog.  For the time being there is only one, on the sidebar, and placed below other relevant stuff so as not to get in the way.  I think there's plenty of room in the page design for it.

I do not ask for, nor do I expect, readers to click the ads, unless the subject appeals to you and you would have done so anyway.  I have a good job and am not planning to make a career of this.  If the ads bother you or you would like to give me any other feedback regarding the blog, please do drop me a note.  Thanks!

Saturday, February 21, 2015

First Impressions - Case Logic 14-inch Checkpoint Friendly Laptop Bag

Back in 2007 I bought a MacBook Pro 17-inch laptop.  It was as sleek as it could be, but still a bit of a beast to haul around.  At the same time, I purchased a laptop backpack big enough to carry it comfortably (one of those SwissGear suckers everyone carries these days).  I pretty much stopped using the MacBook years ago, but I've used the backpack for seven and a half years, day in and day out, to work and back, and on numerous trips.  It's been a gem.

Lately though I've wanted something a bit smaller and more convenient to use on a day-to-day basis.  I recently bought a Toshiba Chromebook 2, a light and thin laptop with a 13-inch screen.  The old backpack just swallows it up.  Even throwing in my Kindle, wallet, phone, and a bunch of other junk I don't really need to tote everywhere I go, there was lots of wasted space.  After 7 years the plastic wrap on the top handle-strap is breaking apart, and well, I've been itching to make a change.

My choice was the Case Logic 14-inch Checkpoint Friendly Laptop Bag (see it at Case Logic's website or over at Amazon where you can read some other reviews).  The bag just arrived, and I have some first impressions.  

Let me start by saying that this appears to be a well-made bag.  The material is the pretty standard black canvas, seems water-resistant, with solid zippers and strap attachments.  The handles are made of the canvas cloth but are well padded and I think will be comfortable for everyday carrying.

What makes this a "Checkpoint Friendly" bag?  You can read the official TSA writeup, but basically it's a laptop bag that can be run through the security scanner without removing the laptop.  The laptop compartment has to be specially designed so that there are no zippers or buckles that would interfere with the scanner's view of the laptop.  Virtually all such bags are designed to "zip open" so that the laptop compartment lies flat with everything else out of the way.

With that in mind, the first thing to note is that this bag does indeed have a dedicated laptop compartment that appears to meet the TSA requirements.  The compartment itself has a set of zippers that only unzip partway down the sides, so that you don't accidentally unzip it far enough for the laptop to fall out.  This review is for the 14-inch version, and my Toshiba Chromebook 2 fits comfortably in the compartment with maybe a half-inch clearance on either side.  Note that you are not supposed to put ANYTHING else in this compartment, including charger, cables, mice, or anything else that might obstruct the scanner's view, and frankly the space is too tight for anything but the laptop.

The next most important feature is a set of zippers, just forward (towards the front of the bag) from the laptop compartment.  These do unzip clear down the sides of the bag, and allow it to lie flat.  When open, the laptop compartment lies by itself on one side, and anything else in the bag on the other.  The laptop side has a red cloth tag with the words "Security Friendly" on it, I suppose just in case you get a TSA agent who hasn't seen one of these before.  The opposite side of this compartment has a mesh pocket where you could conceivably put some papers or something.  I am thinking I will never put anything in there, just so that when this baby is lying flat on the scanner there will be nothing exposed.

Now I've described two sets of zippers already, and there are more to go.  At first I was envisioning myself fumbling around in the security line at the airport trying to figure out what to unzip, but Case Logic has me covered - the zippers that open the bag flat have red stitching on them, where all the other zippers are just plain black.  Once you notice this there is no way you're going to make a mistake.  What is a little tricky is that there are cloth handles sewn on either side of this opening, as well as two strap buckles, one on each side of the compartment, at opposite sides of the bag (diagonal from one another).  With the handles and strap buckles and the strap itself, it can feel a little clumsy getting ahold of the zippers and getting them open.  I suspect this will become second nature after a few tries.

I mentioned buckles on either side of that last section, and a strap.  The reason they're placed on either side of this compartment is so that after the bag comes through security, you can just grab the strap and snatch it up, and the bag will automatically close (albeit still unzipped).  If you are trying to rush through security this could save you a little time.  If you are having trouble envisioning this, check out the Case Logic site link above, where there are pictures of this function in action.

Just forward from the "lay flat" section is another full-sized, zippered compartment.  On one side is a silk-like cloth pocket that could be used for pretty much anything.  However the other side has a padded, felt-lined pocket designed for a tablet.  This is a nice touch - a bag designed for both a laptop and tablet!  I don't carry a tablet these days, but I'm sure my Kindle will be right at home here.

Again towards the front of the bag is a smaller compartment with various pockets for things like passport, credit or business cards, etc.  There is an attachment point for a keychain, and a couple of pen-holders in there.  Finally, right on the front of the bag is yet another small compartment for, well, anything you didn't stick somewhere else, I suppose.

I have transferred pretty much my whole kit from the old laptop bag to the new Case Logic bag.  It's definitely a bit more snug - moving from a laptop backpack designed for a 17-inch laptop down to a briefcase-style bag designed for a 14-inch laptop, there's obviously going to be less room.  Of course that also translates to "carry less junk", which is not a bad thing at all.  It appears that even stuffed full it will have no problem fitting under an airplane seat in front of me when I travel.

I think I'm going to love this bag.  I'll be taking it to work Monday, and in a couple of weeks it's going with me to Belize.  I'll update this post with final impressions after the trip.

The Case Of The Silence On The Wire

I have spent a lot of my career as a network engineer in front of packet sniffers.  I've often heard it said that "the wire doesn't lie", and that's true - as far as it goes.  But packet sniffers (and other analysis tools) don't show you the "truth" either, unless you define truth as just a set of data points.  "Truth" as most of us understand it requires deriving meaning from the facts, and sniffers are pretty limited in this aspect.    The following story will (hopefully) illustrate the sometimes difficult process of extracting the truth from the facts, how different people sometimes draw different conclusions from the same facts, and the importance of persistence in the pursuit of the truth.

My company has a big print and mail facility, and years ago some smart person realized that we had enough excess production capacity to offer our services to other companies.  They lined up their first prospective customer and things got underway.  We set up a VPN over the public Internet between the other company's network and our own.  Our Internet connection came into our primary datacenter, and from there we had a private connection to our print and mail facility.

The print server was a Unix system running the standard line printer daemon (LPD), and the client was running the line printer remote (LPR) protocol on a Windows server with Microsoft Print Services for Unix.  Their system would create a print job and connect to our server over the VPN, whereupon the job would be queued and printed.  We went through a short POC phase, and when everything worked to the satisfaction of the print facility and the customer, contracts were signed and work got underway.

Not long afterwards, my manager was contacted by the print facility folks and asked to look into a problem - the customer was reporting occasional problems connecting to the print server.  We didn't have a lot of VPNs at that time, and the combined mistrust of the Internet and VPNs had led to a suspicion that the VPN was the culprit.  We therefore combed through our VPN logs for evidence of problems, as did an engineer at the customer's network.  Neither we, nor the customer's engineer (who I will refer to hereafter as "Steve") found any evidence implicating the VPN.  However, the problems continued.

We had recently gotten new packet capture devices with large amounts of storage, designed for more or less permanent installation in potentially high-value locations on the network.  We deployed one of these to be able to watch all traffic on both sides of our VPN concentrator - one side would see the encrypted stream, while the other would be able to see the unencrypted stream.  We deliberately chose this spot on the network because it was the "furthest out" on the perimeter network - if we didn't see problems here, we could start looking further into the network, while on the other hand if we did, we could safely ignore our internal network.

We didn't have long to wait until the problem resurfaced, and the packet traces were instructive.  We saw the communications from the client, and every one of these communications looked perfectly clean.  Every connection included a complete TCP three-way handshake, what appeared to be normal communications between the LPR and LPD, and a standard four-way teardown of the session.  Not once during the entire time troubleshooting this application did we ever see anything like a failed connection - TCP handshakes were always complete, the client and server communications were always successful, there were never any sessions that died midstream, and the teardowns were always textbook clean.  It's been a long time but if memory serves, I am nearly certain we never even saw a single TCP retransmission.

What we DID see were inexplicable absences of connections - periods of time usually lasting several minutes in which there were no packets of any kind coming from the client's systems.  During these periods there were no TCP attempts at all.  The trace taken outside the VPN box was similarly devoid of traffic.  In effect, the wire was silent.
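Spotting this kind of silence by eye in a long capture is tedious, so it can help to let a script look for the gaps.  The sketch below is not the tool we used; it assumes you have already exported the arrival time of each client SYN from your analyzer as a list of seconds, and both the sample times and the two-minute threshold are made up for illustration.

```python
def find_silences(connection_times, threshold_seconds=120.0):
    """Return (start, end, duration) for every gap longer than the threshold."""
    ordered = sorted(connection_times)
    silences = []
    # Compare each connection attempt with the one that follows it.
    for earlier, later in zip(ordered, ordered[1:]):
        gap = later - earlier
        if gap > threshold_seconds:
            silences.append((earlier, later, gap))
    return silences


# Hypothetical SYN arrival times, in seconds since the start of a capture.
syn_times = [0.0, 12.0, 25.0, 31.0, 40.0, 290.0, 301.0]
for start, end, gap in find_silences(syn_times):
    print(f"silence from {start:.0f}s to {end:.0f}s ({gap:.0f}s)")
```

With the sample data this reports a single silent period, from 40 seconds to 290 seconds into the capture.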

There was a chance - a very slim one - that something was going on in what little of our network existed outside the VPN box, so we looked for evidence of that.  The VPN box plugged into an ethernet switch, as did our Internet routers.  The switch was clean, as were the routers.  We were not experiencing any interruption of other Internet traffic, our other VPNs were all running clean, and the VPN to this partner was not having problems.

Given that we saw NO issues of any kind, I concluded that the issue was occurring at the customer's end.  I reasoned that intermittent issues on the Internet, or within our infrastructure, would not "respect" the boundaries of TCP sessions - in other words, I would have expected to see problems occur within the TCP sessions.  I might expect to see failed TCP handshakes, or some irregularity within the print jobs streaming over.  The fact that this never happened led me to believe that something was preventing our customer's systems from even attempting to connect for short periods of time.  I should also mention that our print server was handling lots of internal jobs with no issues.

I packaged up my sniffer traces and forwarded them to Steve, outlining my conclusions and the reasoning behind them, and asking him if he could take local traces of his own and confirm whether his systems were making any attempt to connect.  He promised to do so.  I didn't hear from him for a while, and we continued to get reports through the print facility that the customer was complaining about the connection problems, so I reached out to Steve again.

I asked directly whether Steve had taken traces.  He said he had.  I asked if he had seen any irregularities in the traces on his side.  He said he had not.  This might have been an error on my part - maybe I should have asked whether he saw anything at all when the dropouts were occurring, the same "silence on the wire" evident in my traces, but I didn't think of it.  I did ask if I could get copies of his traces to compare with mine, and Steve said he would share them, but they were never forthcoming - so to this day I do not know if he actually took any.

I reported my research to my management, along with my conclusion that the problem must be at the client's end, and that there was not likely anything we could do about it.  Their response was to urge me to keep looking, and so I did.  I took a dozen or more traces, all containing perfect, complete sessions, and usually also containing some of these weird silent periods.

I dug into the traces and started looking at everything that was there in the communications.  Let's see - client SYN packet, source port 721, server port 515, server SYN-ACK, client ACK, some kind of "hello" packet from the client and a response from the server, print job streaming over, client FIN-ACK, server ACK, server FIN-ACK, client ACK.  All perfect.  Next job - identical except for client source port which is now 722 (and the sequence numbers, of course).  And another - client side port 723, etc.  It did occur to me that the client-side port numbers were almost TOO sequential - there were never any skips, say from 721 to 725 - which made me think the client's system must not have been too busy, or that it might be passing through some device that was altering the client ports on the way out of their network.  I also thought most clients should pick their ephemeral ports from a range 1024 and above, but I wasn't too bothered by it.

I looked at this for a while - several times, on various days, over several weeks.  I couldn't find anything that I thought would help, and I slowly lost interest, especially as the rate of complaints dwindled.  After a while I just didn't think about it any more.  It wasn't exactly a matter of "giving up" - I thought I'd done a good job of isolating the issue to the customer's end, and lacking visibility into their network I just didn't see that I could do more to help.

For several months it dropped completely off my radar.  I guess I should have known it would come back though, because unresolved issues never really go away.  One day my manager and director both asked me to get back on the case.  They had been contacted by the manager over the print and mail facility.  The issue was still going on, the customer was complaining more than ever, and now the print facility was having to reboot the server on a regular basis to clear up the issue.  To make matters more urgent, they were hoping to add another customer, but until this issue was resolved they were unable to take that next step.

I started by calling our print server operator - I wanted to know why we were rebooting the server.  He told me that after countless "outages", the customer had requested that we try rebooting the server.  This had been done, and the customer was then able to connect and send print jobs.  Ever since then the customer had gotten into the habit of calling and requesting a reboot whenever they had trouble connecting.

I have to tell you that this made me really angry.  The lack of logic involved here was staggering.  I had clear, indisputable evidence that when the customer was "having trouble connecting", we weren't getting anything from them at all.  I had evidence that every time we did receive a connection request, we answered appropriately - our server always responded.  And in addition, our print server never failed to pick up and handle internal print jobs, which, by the way, were now being interrupted by these frequent reboots.  The whole thing made no sense.

I then talked to Steve.  He told me that since the problem had persisted, they (the customer's company) had started using a command-line utility to check the print server.  They would fire up a command prompt on the system creating the print jobs and run a command that would connect to the print server and display the jobs.  When "the problem" was occurring, the command-line utility would also be unable to connect.  Their operators would sit there rerunning the utility every couple of minutes until they got a successful connection, and then try to restart the print jobs.  Usually after this, they could print again, but sometimes not.

What I got from all of this was that the issue had some sort of time component to it, resolving itself within a few minutes.  I was still convinced that the issue was on the customer's end - there was never any evidence otherwise.  The server reboots simply gave time for the problem to correct itself, but that had always happened anyway.  To test this theory, I advised the print server operator not to reboot the server any more.  I suggested that he didn't have to tell the customer he wasn't rebooting the server - he could just say "OK, try again in a few minutes."  He began doing this, and sure enough there was no difference in the behavior of the whole system.  After a wait of a few minutes, their print jobs would start coming through.

Now that I had gotten the "reboot monkey" off our backs, I went back to the traces.  I took a bunch of new ones and started going through the connections again and again, looking for anything out of the ordinary.  They looked just like the ones from before - client sends SYN packet with client port 721 (or something similar), server side port 515, server sends SYN-ACK, etc.  The connections were as perfect as ever.  In fact, they looked so familiar that I began to wonder if I was looking at my old trace files.  Nope, these were new.  I pulled up some of the old original files to make sure, and it was at this point that I began to grasp the faint outlines of the problem.

The client side port numbers had always bothered me a little bit.  Aside from the fact that they were all under 1024, the range of port numbers was always very consistent - and very small.  The client side port numbers were always within the range 721 - 731.  Eleven port numbers, always in succession, reused again and again.  I would see a connection from port 728, 729, 730, 731, then it would loop back around to 721.  And every so often, usually after a bunch of successful connections, silence on the wire.
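The pattern I eventually noticed by staring at traces is easy to check mechanically.  Here is a small sketch that summarizes a list of client source ports in the order they appeared; the function name is my own, and the sample list simply extends the 728, 729, 730, 731, 721 sequence described above for illustration.

```python
def summarize_source_ports(ports):
    """Summarize the range and ordering of client source ports, in capture order."""
    distinct = sorted(set(ports))
    low, high = distinct[0], distinct[-1]
    # True when every port is exactly one more than the previous one,
    # or wraps from the highest port seen back to the lowest.
    strictly_cyclic = all(
        later == earlier + 1 or (earlier == high and later == low)
        for earlier, later in zip(ports, ports[1:])
    )
    return {
        "low": low,
        "high": high,
        "distinct": len(distinct),
        "strictly_cyclic": strictly_cyclic,
    }


# Illustrative sequence of source ports, in the order the connections arrived.
observed = [728, 729, 730, 731, 721, 722, 723, 724, 725, 726, 727, 728]
print(summarize_source_ports(observed))
```

A tiny count of distinct ports combined with a strictly cyclic order is a strong hint that the client is drawing from a fixed pool rather than from the normal ephemeral range.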

I began to wonder if there was some issue with port-exhaustion - this thing was using such a small pool of client-side ports.  I wondered, how quickly is a client-side port allowed to be reused?  I dug out my trusty copy of W. Richard Stevens' TCP/IP Illustrated Volume 1 and found the TCP state diagram.  I saw something called the "2MSL" wait state which occurs before the socket is fully closed.  The MSL is the "maximum segment lifetime" which is supposed to be two minutes.  The standards for the protocols we still use today were created back when computers and the Internet were MUCH slower, and back then there might be conditions on the network that could cause a packet to arrive late - very late indeed.  Anyway, the standards also said that the partner in a TCP session which initiates an active close (through the use of a FIN-ACK packet) MUST then hold the connection for two times the maximum segment lifetime (2MSL) before it can consider the socket closed.

In other words, the connection doesn't truly close for four minutes after all the teardown messages have been exchanged.  If you do a "netstat" command on a system you will often see sessions in something called the "TIME_WAIT" state.  These are sessions waiting out the 2MSL period so the system can close them.  Basically, from the time I observed the client FIN-ACK and other teardown packets, four minutes would have to elapse before the client-side port number would again be released to the operating system for reuse.

With respect to our customer's printing problem, the issue was now in pretty sharp focus.  For some reason, the client was only using 11 port numbers (721-731).  After 11 successive print jobs, if the timespan of those jobs was less than 4 minutes, all of the TCP sessions involved would be in the TIME_WAIT state.  Until the oldest sessions completed the 2MSL wait, there would not be any available ports for new sessions.  But why were they using such a small pool?
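The arithmetic is easy to check with a toy simulation.  This is only a sketch under stated assumptions: the ports are handed out in rotation as the traces suggested, a port is held for 240 seconds (2MSL with a two-minute MSL) after the job finishes, each job takes a made-up one second, and a job that finds no free port is simply counted as blocked rather than retried.  How the real client behaved internally when it ran out of ports is not something I could see from my side.

```python
TIME_WAIT_SECONDS = 240          # 2MSL, assuming an MSL of two minutes
PORTS = list(range(721, 732))    # 721 to 731 inclusive: 11 source ports


def simulate(job_times, job_seconds=1.0):
    """Return (time, port) per job; port is None when every port is still held.

    job_seconds is a hypothetical time from connect to the client's FIN.
    """
    free_at = {port: 0.0 for port in PORTS}  # when each port leaves TIME_WAIT
    next_index = 0                            # ports are tried in rotation
    results = []
    for now in job_times:
        chosen = None
        for offset in range(len(PORTS)):
            index = (next_index + offset) % len(PORTS)
            if free_at[PORTS[index]] <= now:
                chosen = PORTS[index]
                next_index = (index + 1) % len(PORTS)
                break
        if chosen is not None:
            # The port stays tied up until the job ends and TIME_WAIT expires.
            free_at[chosen] = now + job_seconds + TIME_WAIT_SECONDS
        results.append((now, chosen))
    return results


# Hypothetical workload: one print job every 10 seconds.
results = simulate(range(0, 260, 10))
blocked = [t for t, port in results if port is None]
print(f"first blocked job at {blocked[0]}s, last at {blocked[-1]}s")
print(f"sustainable rate: one job per {TIME_WAIT_SECONDS / len(PORTS):.1f}s")
```

In this made-up workload the first eleven jobs use ports 721 through 731, every job from 110 seconds through 240 seconds finds no free port, and the job at 250 seconds gets port 721 back.  More generally, 11 ports held for 240 seconds each works out to roughly one job every 22 seconds before the pool runs dry.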

The answer to this is in the RFC which defines the LPD/LPR services.  RFC 1179 says that "The source port must be in the range 721 to 731 inclusive."  To be honest, I didn't actually find the answer in the RFC - but some Googling led me to a Microsoft Knowledge Base document which described exactly the problem we were seeing, which mainly occurred on a specific version of Windows Server, and with a suggested fix - a registry setting to cause the Print Services for Unix to use standard ephemeral ports from the much larger pool above 1023.  The document outlined that the command-line utility they were using also drew client ports from the same range, so effectively, if all the ports were tied up in TIME_WAIT sessions, the utility would similarly fail to connect.  In fact, when they DID connect with the utility, they were actually putting an available port into a 2MSL wait again!

I sent an email to Steve, asking what version of Windows Server was in use.  He confirmed the affected version.  I then sent an email telling him what I thought was happening - that his server was using a very limited range of client side ports, that the speed/volume of print jobs was outpacing the system's ability to clear the sessions for reuse resulting in client port exhaustion, and that there was a suggested fix involving a registry setting.  I even sent a link to the Knowledge Base document.

Steve responded that he would look into it, but my interpretation of his response was that he wasn't sure he believed me.  I couldn't do much about that - whenever you are dealing with another company, and that company is your customer, and when you are telling a peer engineer that you have remotely diagnosed a problem in his systems...well, there may be some resistance to the idea.  So I waited to see what would happen.

What happened was exactly zilch - the problem persisted, and again we were being begged by our print facility manager to intervene.  But this time, there really was nothing more we could do - except that I now understood I would have to force the issue with Steve.  I wrote - to my management, and copied to Steve - an exhaustive (and at times pointed) accounting of the entire troubleshooting effort, including my early work and its (accurate) conclusion about the problem being in the customer's network, my difficulties getting information out of Steve, my work in understanding and putting a stop to the reboots, and finally my conclusion - backed up by sniffer traces and documentation from the customer's server OS vendor - that the problem was caused by port exhaustion.  I included the Knowledge Base document for reference, and stated that my department was finished working the issue, once and for all.

Within a week of sending that email, Steve (or someone else at his company) had made the registry changes to their system, and the problem never surfaced again.  It had taken six months, dozens of hours looking at traces, emails back and forth with an incompetent or unhelpful peer, a lot of pain and suffering on the part of our print facility, and research into another company's network and systems, but the problem was finally resolved.

Whenever I work a problem - especially when it's such a challenging and painful one - I always look for "lessons learned."  This one was particularly fruitful:

  1. Persistence, persistence, persistence - over and over throughout my career troubleshooting, I have run into problems where it seems like I just stare at packet traces or logs until my head is about to burst, and then, like a ray of sunshine coming through the storm clouds - the solution appears.  This case was somewhat rare in that there was a period where I accepted that the problem was "solved" even when it wasn't.  I had felt that in proving the problem was on the customer's end, my work was done.  Figuring out when to stop, when enough is enough, is part of maturing as a troubleshooter and I may do a blog post about that later - but in this case, my real customer was always the people at my company's print and mail facility, and until things were completely resolved, my work was not truly done.
  2. Troubleshooting a problem that exists on a foreign network is really hard, but NOT always impossible - this probably doesn't require much more explanation than what is available in the story above, but I've often seen network engineers focus on this sort of "us vs. them" strategy in problem solving.  The idea is that if we can prove it isn't US, then it must be THEM, and we can't do anything more.   A lot of times there is a bit of shaky logic employed, something like "Well, our printer works fine for everybody else, and we don't have any other VPN or Internet problems, so it's not us."  Of course, Steve always insisted that they were not having problems sending print jobs to anywhere but us, which is possible if we were the only LPD server they were targeting.  All of this may have been true - and in my case I had even better evidence from my traces that we weren't even receiving communications from the customer's network - but the fact remains that the answer was always right in front of me, in the packet traces I was so proud of analyzing.
  3. Dealing with "peers" on other teams or at other companies can be just as challenging as the technical act of troubleshooting - "Steve" is a prime example of something I've dealt with many times over the years.  He was either unwilling to really look at his network and systems, or incompetent, or both.  I believe that if he had actually performed network sniffer traces he would have noticed that there was NO communication coming from his print system during the outages, which would have led him to the same conclusion I had reached.  The fact that this didn't happen, combined with his apparent unwillingness to share the traces he claimed to have taken, leads me to believe he never did them at all.  Of course it's possible he did the traces but just didn't interpret them properly.  I don't suppose I will ever know.  I also strongly suspect that the Windows Server in question would likely have been writing event log messages regarding the connection problems, had anyone over there cared to take a look.  But Steve was what I had to work with - I had to continually reach out to him, probe for information and prod for action, while trying not to upset or insult him, in order to finally get the action required.  It was neither easy nor pleasant.
  4. No matter how much you know, there's always room for more - going into this problem I thought I knew the basics of socket communications pretty well.  But it took far too long for me to notice the oddly low port numbers, or the small pool in use.  There's a lot of detail in packets and packet traces, and it takes diligence to spot patterns like these.  
And so ends the tale of the Silence On The Wire - hopefully you made it here to the end, and it was worth coming along for the ride.



Thursday, February 19, 2015

A Long, Sordid Tale

I have a story to tell.  It was years in the making.  Hopefully it won't take that long to tell.  I can't promise you'll enjoy it, but it has a (sort of) happy ending and there are some lessons in it, if you make it to the end.

Did I ever mention I helped start the first Internet service in Camden County, GA?  Back in the day there was no local dialup service, so me and some friends at the computer store where I worked got a wild hair to bring Internet to our area, and we did it (with some funding from a partner and a lot of help from a provider in a non-competing area).  After two years we had 2000-plus paying customers!  We also had a pant-load of debt, and we were really lucky to get bought out before we got into real trouble.  I walked away from that experience with no debt, and a crash course in how the Internet works which served to get me my present job as a network engineer with a big company.

Our local phone company is TDS Telecom.  TDS mainly serves Wisconsin, but they have gotten into a number of other markets, and they had bought the local mom & pop phone company here in the early 1980's.  A few years after I had moved on to my new career, they began offering DSL service.  I live a few miles outside of town in a sparsely populated area, so I knew it would take a while to reach my neighborhood, but eventually I got 1.5 mbps service - yay!  It was incredible!  After a year or two they bumped the speed to 5 mbps.  Life was great.  Then they began offering 10 mbps and I was in heaven.

Around that time the first set top streaming boxes were starting to hit the market.  I was an early adopter of the Boxee Box - a little box designed by D-Link, running Boxee software (which was a fork of XBMC).  There was a decent amount of content and it was fun to play around with.  One of the apps for the Boxee was MLB.TV, which I was really excited about.  I signed up and started watching baseball on it.

But it wasn't long before I started noticing problems.  The games would stutter and buffer a lot.  The MLB app on the Boxee supported automatic quality adjustments, and the picture would get really bad, and the constant buffering made it really hard to enjoy a game.  Oddly enough it seemed the service was fine on the weekends, especially in the mornings, but in the evenings on weekdays it was terrible.

I did some speed tests and found that my evening speed was pretty bad - almost always below 5 mbps, sometimes dropping down to 1 or 2 mbps.  The MLB.TV FAQ said I needed a pretty solid 4 mbps to watch a game, so that seemed to be my issue.  I called TDS tech support and told them I was having some speed issues.  They checked my line, "reset my port", pronounced it all clean...but I kept having issues.

Over a period of some six months, I dorked around with everything I could.  I got a new router.  I ran a clean phone line from the box outside the house straight to my modem, no splices.  I ran wired ethernet to the living room for the Boxee.  And when 15 mbps service became available I signed up for that.   Oh yeah, I also called tech support a half dozen more times - checked my line, reset my port, etc.  But no matter what I tried it just didn't improve.

Finally I called tech support and insisted they escalate to the next level of support.  After a long wait, a nice lady came on the line to explain that my problem was oversubscription.  Basically TDS had sold more service than the equipment serving my area could support.  She said they had to order a new switch, and that currently it was scheduled to take about 3 months to fix.  She assured me that there was a running ticket on this problem and that I should just wait it out.

So I did.  I waited 3 months, and then I waited an extra month for good measure.  And when it still hadn't gotten better, I called back.  At first, the tech tried to tell me TDS would never give out information about their installation schedules, but when I insisted on talking to a supervisor, I got an answer - it would be another 3 months, there had been a backlog of some kind, the switches were all custom built, whatever.

I waited 4 months again, and it still wasn't any better so I called back.  And the answer was, again, about 3 months.  Given that TDS had no competition for high speed Internet in the area, I had no real choice but to wait it out.  Another 4 months went by, and still things didn't improve.  It was now 12 months since I'd been told about the oversubscription, more like 18 since I'd been calling in about the problem, and it looked like it would never get better.

Around this time I was talking to a friend at work who suggested I contact the Public Service Commission.  I decided that before doing so, I would call TDS one more time, but I would tell them that I was GOING to call the PSC.  And this time, I got a little different response.  A supervisor offered to give me the name and phone number of a guy I'll just call "Chuck" (not his real name), who was in charge of scheduling all the big installation work for our area.  "Chuck" would be able to at least tell me the real scoop on the schedule.

The first time I got to talk to Chuck, I was sure this was a quality guy who would get things straightened out.  He told me the current switch was only capable of handling a couple of DS3s (45 mbps connections) and couldn't be upgraded.  However, they were building an entirely new switch and when it was in, they would have a couple of gigabit uplinks - something like 20 times the available bandwidth.  He couldn't give me a date but thought it would be ... about 3 months.  Oh well, I thought, at least now I know what's going on.

It turned out to be more like 6 months, but the new switch got installed, and one day a technician showed up to let me know they were moving my connection to the new switch.  Once it was done, I checked the speed...and I was now getting a whopping 5 mbps.  What happened to my 15 mbps service?  Well, the new switch was a mile further away, and the way DSL works is very distance-dependent.  The good news was that the speed no longer dropped out in the evenings, but I was barely getting the kind of speed you need for good streaming.

And that was a real problem, because by then I had chosen to become a "cord-cutter." I had canceled my DiSH Network service - no sense paying $100 a month for a bunch of channels I didn't want to watch - and I had gotten a Roku box to replace the Boxee, and well, 5 mbps was just barely getting the job done.  That mattered because as a network guy, I wanted to use my Internet service for more than one thing at a time, not to mention I was now in a long-distance relationship and had become very dependent on Skype.

Chuck told me he was very sorry, this was not the solution he had envisioned.  He told me there was still a chance he could get a new switch put in to replace the one closer to my house, and that he would try, but it was going to take a while and this time he was not willing to give me a projected date.  I thanked him for everything he had done, and gave up...for a while.

About 6 months along, I decided to check in with Chuck and see if there had been any news on the new switch.  And I was astonished to hear that he had actually already gotten it done!  However, he cautioned that it would be a while before they could cut anyone over to it.  There was some story about new software at TDS for managing customer orders and it was going to be, you guessed it, about 3 months before he could move me to the faster connection.  By now I understood that "about 3 months" was TDS code for "no freaking clue", but it looked like there was light at the end of the tunnel.

I waited 5 months this time before bothering Chuck again.  I was really getting tired of this.  My struggle to simply be able to order, pay for, and enjoy the service TDS was advertising had been going on for YEARS.  I wrote my frustrations into a note for Chuck, doing my best not to burn bridges but letting him know I felt I was getting jerked around.  I don't know if that made any difference, but a few days later, something happened.

I came home from work to find my dial-tone cut off.  I actually wrote another blog post about this so I won't detail it all here, but the next day my DSL went offline as well, and it was a couple of days before it got sorted out.  They had cut me over to the new switch but hadn't bothered to configure service on the line.

With this fixed, I now had pretty good speed - somewhere around 12 mbps, which is decent for DSL.  I was happy for a few months, until the dropouts started.  This was a new problem.  I could surf, or watch TV, for minutes or maybe even an hour or two at a time, and then...dead air.  The DSL light on my modem was never going out, but I couldn't get any data in or out.  Resetting the modem would immediately fix the problem, or sometimes it would resolve itself without me doing anything, but it wouldn't stay on for good.

I was quite certain I wasn't having a problem with local equipment in the house, and when I called TDS tech support, I tried to convince them that they didn't need to send a technician.  However something at TDS had changed - they now had their phones being answered by an off-shore outfit, maybe in Jamaica judging from the accents.  These techs simply wouldn't even consider looking at anything until someone was dispatched to the house.  I got an agreement from tech support that their field technician would only look at the line outside the house, and would not enter the premises, and I let them schedule the visit.

When I got home from work the next day, the field tech was there.  My girlfriend was there talking to the guy out in front of the house, and he informed me he had gone into the house and replaced my modem!  It turned out my girlfriend thought I expected this and would be OK with it.  The tech said my old modem was no longer supported, he had failed to find a problem hooking up to the line outside the house, and so assumed the problem must be inside.  I tried to explain that in order to see the problem he would have had to stay connected outside for up to an hour, but no dice - and he was not giving my old modem back.  I'm not a physical kind of guy but I had a fleeting thought of wrestling the old modem from his grasp.

He had replaced my old "modem" with a new wireless plus 4-port ethernet router.  He had pulled my old router out - because he didn't have any idea how to set all my stuff back up - and left me with nothing working at all except his new router.  I was furious.  I spent the next couple of hours disabling the wireless on the new router, disabling DHCP, putting it in bridge mode, and getting my own router back online.  And within an hour, the connection was dropping again.

Just to make sure it wasn't my equipment in the house, I removed my router, re-enabled wireless and DHCP on the router they had provided, and basically made everything just like a standard customer.  I made another call to the Caribbean tech support people, insisted they escalate to the next tier, and got a guy who could actually troubleshoot.  He quickly diagnosed a DHCP problem causing their system to give me addresses that had already been given to other customers, and he got it fixed.  This should have happened on the first call, and my equipment should never have been touched.

Still, I left the phone company's router (an Actiontec V1000H) in place as my primary router for the time being.  My girlfriend was visiting for an extended stay, and every time I tore everything out and reconfigured things it would keep her from being able to watch TV or use her computer on the Internet, so I decided to wait until her visit was over. It turned out to be a good thing - because within a few weeks, the connection was dropping out again, and I was able to point out to TDS that none of my own equipment was at fault.  These dropouts were different - with the DHCP problem, the dropouts lasted for quite a while, but with these new ones, the connection would come back quickly.  They would sometimes come in waves of several dropouts in a row within a few minutes, then things would settle down.

Looking at the router's status page I could see that we were getting numerous retrains, a symptom that the combination of errors and weak signal-to-noise ratios was causing the router to renegotiate the speed.  When rebooted, the router would sync up at about 12-13 mbps, but after a set of these retrains it would drop down to about 6 mbps.  Another call to tech support, another site visit - I stayed home for this one to make sure they didn't enter the house or touch my equipment.  The field tech replaced the box on the outside of the house, checked some things at a junction box at the street, made a couple of trips down the road to the switch...but when he was done, he said he didn't think it was any better and wanted to get a more experienced technician to come out the following day.

I couldn't stay home again, so I left my girlfriend in charge and went to work.  It's too bad, really - I have no idea what this guy did, but whatever it was worked wonders.  Ever since he left, I have had a solid 15 mbps connection (I NEVER got 15 before) and it never drops out.

I had one final issue with all these changes.  I recently bought a new printer, and my desktop computer uses the mDNS protocol to discover printers on the network, and it wasn't working.  It turns out the Actiontec doesn't pass mDNS (or at least I couldn't find an option anywhere in the configuration to allow it).  Once again I disabled all features of the Actiontec router (turned off WiFi, turned off DHCP) and put it in bridge mode, and I hooked up my trusty Asus router.  The printer problem was solved.

As I have thought through this entire affair in the process of writing it out, I have tried very hard to come up with some lessons learned.  The one clear thing in all of this is that when things aren't working, you have to be persistent.  It may not be possible to force a provider to do their job and make things right, but you can bug the crap out of them until they do.  My biggest regret was allowing them to put me off for months at a time with their repeated promises that they were fixing things.  I really should have called the Public Service Commission, and early in the process.   If I had had the option of switching to another Internet provider I'd have done so years ago, but I didn't have that choice.  It's taken years to get things right, and it's difficult to say I'm satisfied given the painful process.  But I have what I'm paying for, and I suppose that's better than where I was.

Now...TDS is starting to run fiber to the area, and will soon offer 50 mbps service.  I wonder if I will have the guts to sign up for it?

Sunday, February 15, 2015

Chromebook - How I Work, Apps I Use

I purchased my first Chromebook a few months ago, knowing full well the "limitations" of a device that "only" runs Chrome, and which requires an Internet connection for most functions.  Of course Chrome is actually a platform that can support a variety of apps and extensions, all of which make it feel like much more than a browser.  The purpose of this post is to share the apps I find most useful for day to day tasks, what I like (or don't) about them, and to mention a couple of peripheral devices I find handy.  I intend to update this post on occasion as I find new things that are useful.

One of the great things about the Chrome Web Store is that like other "app" stores, there are frequently multiple choices for apps that fill a given need.  Since nearly all of the apps are free, it doesn't hurt to try different things until you find a good fit.  In most cases I've found things that work at least "well-enough", in a few cases I've found things that are fantastic, and in rare cases I've simply found the current choices just aren't very good (yet).

Mouse

I have a Toshiba Chromebook 2, which happens to have a nice trackpad - large, smooth, accurate.  However there are just times when I prefer to use a mouse.  The Chromebook supports just about any standard mouse, whether connected by USB, Bluetooth, or receiver.  My choice is the Logitech M325.  It's small enough to carry around, big enough to actually use, and battery life is very good.  The M325 uses the Logitech Unifying Receiver, a tiny USB wart that doesn't stick out very far (I leave mine plugged in all the time, whether or not I'm using the mouse).  No extra software is required to use the mouse, but if you want to link more than one Logitech device to the receiver, there is now a "Logitech Unifying For Chrome" app you can install from the Web Store which will allow up to six devices to use a single receiver.  

Printing

By default, you can "print to PDF" and have the file saved locally or on your Google Drive.  If you want an actual hard copy printout straight from the Chromebook (i.e. without having to transfer the file to another computer to then print), you will want a Google Cloud Printer.  Quite a few printers these days are Cloud Print-ready, meaning they do not need to be shared from a regular PC or Mac.  My choice was the Brother HL-L2380DW, a black and white laser printer with copy and scan features.  

Storage

Local

The Chromebook will (usually) have either 16 or 32 GB of local, SSD storage (actual available storage will be less due to space used for the OS).  Although the design goal of Chromebooks is for you to use cloud storage for most of your files, there are times when it's useful to have some things stored locally - for example if I'm going on a trip and I'm not sure I'll have solid Internet connectivity, I might want to load up some movies to take along, etc.  Pretty much every Chromebook has an SD card reader.  They also have one or more USB ports, so various types of USB drives can be connected (from thumb drives to hard drives, etc.).  My choice was to buy a 64GB microSD card and to leave it in the machine at all times.  It was pretty cheap, it's fast, and at least on the Toshiba it mounts flush.

Online

Google offers new Chromebook users 100GB of Google Drive storage free for two years.  Then if you wish to continue using it you can pay a monthly or yearly fee for whatever amount you need.  By that time you should have a fair idea how much you actually need, which may be far less than the 100GB.  As fond as I am of Google, I hadn't used my Google Drive much prior to owning the Chromebook.  Now that I have one, I have added Google Drive access to my desktop Mac at home, and my desktop PC at work, and it's so easy to move things around that I am wondering why I didn't do it sooner.  It's SO much easier than copying files to thumb drives, emailing them around, etc.

Email

I have been a Gmail user for about as long as the service has been available.  Lately I have switched to Google's Inbox interface for my Gmail accounts.  I'm not entirely sure I like Inbox all that much, because I prefer to delete old emails as opposed to archiving them, and they buried the delete function in a menu.  Also it performs kind of poorly on the Chromebook.  I'm giving it some time to see if they can do some optimization.  However, I like the way it organizes and groups things, and I like the "helpers" or accelerators or whatever they call them which scrape your emails and present helpful information directly in the interface (like information about upcoming flights, and tracking information for purchases).  I have several Gmail accounts, and I nearly always have at least two Inbox tabs open when I'm on the Chromebook.

Chatting (Instant Messaging)

Right now all I'm using is Google Hangouts on the Chromebook.  I don't do much Facebook messaging but I can do that within the standard Facebook web interface just fine (if I ever install a Facebook messaging app that I like, I'll add it to the list).  I have not yet tried any of the options for linking to my phone for SMS messaging.  

Office Suite

I use Google's Docs & Sheets.  If I needed to do a presentation I'd use Slides.  I do not work on "work" documents on my Chromebook, it's only my personal stuff, and for that the Google choices are more than adequate.

Calendar

I have been using Sunrise Calendar for a while now and I really like it.  It interfaces with LOTS of different calendar systems and merges everything into a single, useful view.  I like that it pulls in Facebook birthdays and events.  I also like that it runs in a separate "app window" on the Chromebook.  I almost always have Sunrise open on my Chromebook.

Twitter

I have installed Tweetdeck.  It has a great, multicolumn interface, looks awesome (it's one of those apps that make you forget you're "just" running Chrome), runs in its own window...I almost always have it up and running.

Reddit

Note - I'm not a heavy Reddit user, certainly not an expert on it.  I have installed Reditr, a neat Reddit reader that runs in its own app window and has a nice multicolumn display, similar to Tweetdeck.

Facebook

Obviously I just browse to Facebook like normal, but I have installed Social Fixer For Facebook which cleans up a lot of the clutter, makes it easy to preview images, etc.

Instagram

I mainly use my phone to browse Instagram activity, but when I want to do it on the Chromebook I use Pixta.  It has a clean interface, allows you to comment, "like" images, and download them for offline use.

Notes

I used to use Evernote, but I was never a power user.  When Google introduced Google Keep, I ditched Evernote.  Since Keep synchronizes with the Google Cloud, and since there is a nice interface for it on my Android phone, I'm a happy guy. 

Music

This is a crowded category, and I'm not entirely sure I'm done trying to figure out what I ultimately want.  For years I loaded my personal music on my phone's SD card, and for those times that I wanted streaming music I used Slacker.  But for about a year I have been using Google Play Music, and yes, I am subscribing to the paid version.  I have uploaded almost 3000 of my own files, so I can listen to my stuff any time I have an Internet connection.  Google Play also has lots of curated "radio stations" which are tied to various moods and genres, and I like that I sometimes find new stuff (or get reminded of old stuff) that I like.  It's not perfect, but it's filling the need.  I also sometimes listen to TuneIn Radio, when I want to listen to actual radio stations.  There is no good app on the Web Store for TuneIn at this time.  Also, some types of streams are not currently supported on the Chromebook, so your mileage may vary.

Summary

That's it for now.  I'll be updating and adding to this list as I go.  Feel free to comment, and make some suggestions.  I certainly don't mind trying new things!

Saturday, February 14, 2015

Cloud Printing with the Brother HL-L2380DW

A couple of months ago I began using a Chromebook as my primary mobile computer.  I mostly use it for email, calendar, and for watching a few interesting web services like Twitter and Reddit.  Occasionally I found myself using Google Docs or Sheets to work on documents, and though I could save them to Google Drive and then access them from somewhere else, I thought it would be nice to be able to print them directly.  I had heard of Google Cloud Print but hadn't tried it out.

I read that I would need either a Cloud Print-ready printer (network connected and with built-in software for registering to my Google account), or a printer attached to a regular computer that could be shared through Cloud Print.  As luck would have it my basic black and white laser printer was in pretty bad shape and it was just about time to replace it, so I decided to shop for a new printer.

My needs:  I wanted laser due to low cost per page, and because laser toner doesn't dry up or otherwise "go bad" like inkjet ink.  I don't print that often, sometimes going weeks without, but when I want to print, I don't want to mess with cleaning inkjet heads.  I wanted black and white - I have no use for printing color.  And of course, I wanted a Cloud Print-ready printer.

I did some searching for "cheap black and white laser printers" and found several reviews.  From those I was gravitating towards a basic model of Brother printer.  Although I do a lot of shopping online, for some things I prefer to shop in a traditional brick-and-mortar store, so I visited my local Staples.  They had the printer I wanted, but due to a sale, they had the HL-L2380DW at the same price.  This model has the same laser printing engine as the one I had selected, but adds copy and scan capabilities.  It also offered a $30 rebate for in-store credit which I figured I could eventually use for toner.

As you would expect, the Brother is fairly easy to set up.  It connects to the network either wired or wireless.  Wired with DHCP is easier of course, but wireless wasn't much harder - a built-in color LCD touchscreen made it easy to select my wireless SSID and supply the password.  I was able to access the web interface of the printer with no problem moments after setting it up.

My desktop computer at home is an iMac, a few years old running the latest Mac OS (OS X 10.10.2 Yosemite), and I actually ran into a couple of issues related to that, and the fact that the Mac uses Bonjour (mDNS) to automatically locate printers.  It turns out that my home router (an Actiontec supplied by my ISP) doesn't pass Bonjour traffic.  There's a long story I won't get into here, but I had a replacement router ready to go that I was already planning to use (a really nice Asus model) - substituting that for the Actiontec resolved this issue so that my Mac, and the Brother ControlCenter app, could "find" the printer and work properly. NOTE: I have since learned that it is possible to get mDNS working through Actiontec routers, although there's no specific option for it - it has to do with allowing multicast traffic over the wireless, and I'd suggest Googling around for some instructions if you care to try it.  I prefer the Asus router anyway.
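For the curious, here is a rough Python sketch of what a Bonjour-style discovery query looks like on the wire, which helps explain why a router that drops multicast breaks it.  mDNS queries are ordinary DNS questions sent over UDP to the multicast group 224.0.0.251 on port 5353 instead of to a DNS server.  The service name "_ipp._tcp.local" is my assumption for illustration (it is the usual type for IPP printers, and I have not checked exactly what the Brother advertises), and the function names are made up.

```python
import socket
import struct

MDNS_GROUP = "224.0.0.251"   # IPv4 multicast group used by mDNS
MDNS_PORT = 5353


def build_mdns_query(service_name):
    """Build a one-question DNS query (type PTR, class IN)."""
    # Header: ID 0, flags 0, one question, no other records.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    # The name is sent as length-prefixed labels ending in a zero byte.
    qname = b""
    for label in service_name.strip(".").split("."):
        encoded = label.encode("ascii")
        qname += bytes([len(encoded)]) + encoded
    qname += b"\x00"
    question = qname + struct.pack("!2H", 12, 1)   # QTYPE 12 = PTR, QCLASS 1 = IN
    return header + question


def send_query(packet):
    """Send the query to the mDNS multicast group (not called here)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # mDNS packets are sent with an IP TTL of 255.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 255)
        sock.sendto(packet, (MDNS_GROUP, MDNS_PORT))
    finally:
        sock.close()


# Assumed service type for illustration only.
packet = build_mdns_query("_ipp._tcp.local")
print(len(packet), packet.hex())
```

Because the packet goes to a multicast address rather than to the printer itself, the printer only hears it if the router forwards multicast between the wired and wireless sides, which would be consistent with what I saw on the Actiontec.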

Setting up Google Cloud Print was what I really wanted to try, and as it turns out was quite easy (contrary to some of the reviews I've read).  The printer's web interface has a "protocols" page where you can set up a variety of printing capabilities, and Google Cloud Print is clearly listed.  You provide the name by which you want to see the printer listed, click a "Google" button and you are redirected to Google's web site where you authenticate to your Google account, and then you can see the printer listed on a site where you manage your cloud printers.

Using the printer is just about as easy as you could possibly want.  From Chrome (which you are automatically using on a Chromebook), you just choose to print from any tab you're viewing.  You select your printer just like you would in any other app, make whatever adjustments to the settings you like (portrait vs. landscape, single or double-sided print, resolution) and print.  The job is transmitted to the cloud, and within a minute or two the printer picks up the print job and executes it.  To be honest I am not sure of the technical process involved in the printer's connection - my printer is "sleeping" most of the time, yet it always picks up the jobs and prints them quickly, so I am guessing it runs some kind of background process to periodically check in with Google.

What I've described so far is the most basic functionality you would ask from a Google Cloud Printer, and these days there are lots of other makes and models that will do as well.  However, this specific model (the Brother HL-L2380DW) does a number of other neat things:

  1. Copying - the printer has a glass-top scanner that allows you to do basic copying.  This model does not come with a sheet feeder so it will not be convenient for copying a large document, but if you have something just a few pages long it's not too tedious and does the job.  The function includes a couple of neat modes - if you want to copy a document and have the printout use both sides of the paper, you can tell the printer to do that using options on the LCD touchscreen and it will prompt you through the process.  It even has a special routine for copying both sides of an ID card and printing both sides on a single page.
  2. Scanning - the device functions as a full-color scanner.  Using the supplied Brother utilities you can scan and have the document stored or automatically opened in a variety of applications.
  3. Mobile device connectivity - you can certainly install Google Cloud Print services on your phone or tablet, but Brother has an app you can use to more directly access the device for printing and scanning from a mobile device. The app allows your device to connect over the network and print just about anything from the phone, or to scan a document and have it wind up on the phone.  Note that while the app has NFC, this model printer does not.
  4. Web services - Brother has a service called Brother Web Connect which allows the printer to (sort of) directly interface with a number of web-based services including Google Drive, Flickr, Facebook and several other things.  Using it is a little clunky - you have to browse to the Web Connect web site, select the desired service and obtain a temporary code that allows the printer to access the service for 24 hours.  Then, you can use the printer's LCD interface to download and print from the selected service, or scan and upload.  The cool thing about this for me is that I can scan and have the document directly stored to Google Drive, which then makes it available on my Chromebook.

All in all, I have found the Brother HL-L2380DW to be a relatively full-featured device for the price.  The Google Cloud Print functionality is easy to configure and works as advertised.  The addition of various copy and scan capabilities, including the ability to scan to Google Drive, is a nice plus.  I would recommend this printer to anyone who needs a solid black and white laser with Cloud Print capability.

Wednesday, February 11, 2015

Two Months In - My Chromebook Experience

Like most folks these days, I want to be able to connect to and use the Internet pretty much anywhere, and that means having some form of mobile computing device.  And like many others, I find that I really can't do everything I want on a smartphone.  Small screen size, the lack of a fully-capable browser, lack of a real keyboard, all of these make certain activities on the smartphone a real chore.  A few years ago I bought an Android tablet with a detachable keyboard, hoping that it would fill the bill, but tablet browsers are still not able to do some of the things a full browser can do.

So it was that late last year I began to look seriously at trying a Chromebook.  In the interest of full disclosure I should tell you that I didn't care to buy a Windows computer, at any price - I stopped using Microsoft products on my home computers 10 years ago, only using Windows now at work where I don't have a choice in the matter.  Plus, to be honest you need a decent amount of computer to make Windows worth having.  You can buy a really cheap Windows computer these days but I imagine the experience is not going to be what I'd want.  Microsoft proudly points out that you can run "real" Microsoft Office and Photoshop on these things, but I sure wouldn't want to.

When I began looking at Chromebooks, the first thing I had to do was get my head wrapped around the hardware specs - these machines nearly all had slower, low power processors, very little RAM, and tiny amounts of onboard SSD storage.  I learned that Chromebooks have a slimmed-down operating system designed to support online connectivity and to run the Chrome web browser and not much else, which is why they get away with such low powered CPUs and so little RAM, and that the expectation is for user data to be mostly stored in the cloud, which is why there is so little local storage.
However, there are other hardware and design considerations that made most Chromebooks pretty unattractive.  Keyboard quality varies a good deal but is about what you would expect of a really cheap laptop.  Sound is so-so.  Screens are functional but not great - usually small (most models are about 11 inches), and screen quality is not much to speak of.

For these reasons I eventually chose the Toshiba Chromebook 2 which hit the market late last year.  The CB2 came with 4GB of RAM (on the high side of what's normally available), 16GB of SSD storage (the usual amount), and uses an Intel Bay Trail Celeron CPU (a version of the Celeron designed to conserve battery power at the expense of a bit of speed).  The main reason for picking the Toshiba was that it has a 13-inch FULL HD (1080p) screen that uses IPS display technology - in other words, this is a Chromebook with a screen that is as good as a MacBook Air.  The Toshiba also has decent audio (the built-in speakers aren't terribly impressive, but they are not bad either).

I've been using the Chromebook daily since around Thanksgiving last year, though not as my primary computer - I keep it up at work with tabs open to a couple of Inbox sessions, usually a tab open to CNN, a tab for a flight tracking site I spend way too much time on, the Sunrise calendar, and Tweetdeck.  For this type of use, I rarely notice any slowness or issues, except that the Inbox app for connecting to Google Gmail accounts is sometimes a little laggy.

I've used the Chromebook to work on a few word processing documents using Google Docs - the experience isn't earth-shattering but it works.

For streaming music and movies, the Toshiba Chromebook 2 has been great.  Honestly I don't use the built-in speakers much, but when I do they provide plenty of volume.  Audio with a decent set of earbuds sounds great.  Video is just fantastic - full screen, 1080p HD streaming video looks phenomenal, especially considering the low power of the CPU.  The screen is gorgeous and compares favorably to much more expensive laptops.

One of the few complaints I still have is that there is no way to scale the UI elements on the desktop separately from the rest of the screen.  Everything is pretty small running at 1080p, and I'm getting to the age where reading glasses are required.  It's easy enough to scale up the text in web pages though.

If anyone has questions regarding my experience using Chromebooks in general or the Toshiba in particular I'd be happy to try and answer.