Accessing AWS EC2 Files With SMB Over SSH

The problem
Recently I had to spin up a couple of Ubuntu EC2 instances for development work. However, I didn’t want to give up the comfort of my familiar graphical editor on the laptop, nor install additional software on my local system.

The solution

Set up EC2 instance
Enable ssh port forwarding
Add the following lines to /etc/ssh/sshd_config
# Port forwarding
AllowTcpForwarding yes

Restart sshd
sudo /etc/init.d/ssh restart

Enable Samba
sudo apt-get install samba
sudo smbpasswd -a <user name>
sudo smbpasswd -e <user name>

Make a backup of the Samba config file
sudo cp /etc/samba/smb.conf /etc/samba/smb.conf.bk

Create a Samba share
Add the following lines to /etc/samba/smb.conf
[<share name>]
path = <path to share directory>
valid users = <user name>
read only = no

Test the config file and restart smbd
testparm
sudo service smbd restart

Set up local system
Start ssh port forwarding
sudo ssh -L 8080:localhost:445 -i ~/.ssh/id_rsa <username>@<EC2 instance IP>

We can now connect to EC2’s SMB share using
smb:// (MacOS) or \\ (Windows)
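
Before pointing the file browser at the tunnel, it can help to confirm that the forwarded port is actually accepting connections. Below is a minimal Python sketch, assuming the tunnel from the ssh -L command above listens on local port 8080. The helper name port_is_open is made up for this example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing is listening on the other end.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling port_is_open("localhost", 8080) while the ssh session is running should return True. It only checks that something is listening on the port, not that Samba is answering behind it.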

  • Instead of tunneling SMB over SSH, we could have installed sshfs (requires FUSE)
  • I blame Cornell CS for my unhealthy addiction to Visual Studio but that’s another story for another time…

The Definitive Guide to SMR


OpenIoT Summit 2016

Below are some rough notes and thoughts from the Linux Foundation’s OpenIoT Summit, April 4 - 6, in San Diego.

Event web site:

IoT is all about communication (between devices and to the cloud) and its impact on revenue (where value-added services (compute/control) are located). Most IoT devices today operate in silos, independent of each other (e.g. Nest, Fitbit). This “island of devices” model can help companies capture brand “loyalty” and potential continuous revenue streams, at the cost of a worse user experience and less freedom (e.g. iOS and the Apple app store). In a way, this is similar to how the Internet operated before the advent of HTTP and HTML (silos of information separated by incompatible protocols, networks and data formats).

Interestingly, the situation above did not result from a lack of available IoT standards (protocols and frameworks). In fact, it is the very opposite: there are too many IoT standards that each claim to be the only open and interoperable solution! This results in confusion and wasted effort in the IoT ecosystem, as companies spend more time and resources supporting various IoT standards than working on their value differentiator. It does not help that Qualcomm, the founding member of AllJoyn (AllSeen Alliance), decided to support the competing framework, IoTivity (OIC/OCF), as well.

Most of the companies at the conference prefer proximal rather than cloud control of IoT devices, under the guise of privacy and security concerns. However, this may be a biased view, as none of the companies that presented owns any significant IoT cloud market share (like Google or Fitbit). Device manufacturers mistrust IoT cloud companies, as the latter could potentially create a walled garden that excludes vendors and reaps most of the revenue from value-added services.

Finally, there was a lot of concern about security. Linus Torvalds and others spoke frequently on the “unpatchable” nature of IoT which presents a significant security risk, as IoT devices are “real physical devices” rather than just software applications.

Obligatory xkcd

IoT transports and frameworks
  • Transports
    • Wifi
      • Ubiquitous
      • Good range & BW
      • High cost and power
    • Bluetooth smart (low energy)
      • Standard on most mobile devices
      • Low BW
      • Medium range
      • Low cost and power
      • Mesh network is coming
    • Zigbee
      • 802.15.4
      • Low range & BW
      • Low power
      • Very low cost
      • Mesh network
      • Known to have interoperability and interference problems
    • Zwave
      • Proprietary
      • Low BW
      • Medium range
      • Low power
      • Very low cost
      • Mesh network
    • Thread
      • 802.15.4 + 6LoWPAN (IP addressable)
      • Low BW
      • Medium range
      • Low power
      • Low cost
      • Mesh network
    • EnOcean
      • Ultra low BW
      • Energy harvesting - no power consumption!
      • Medium range
      • Low cost
      • Requires special antenna
  • Frameworks
    • IoTivity (OIC/OCF)
      • CoAP (REST model)
    • AllJoyn (AllSeen Alliance)
      • Started as a Qualcomm project, now a Linux Foundation project
      • D-BUS with RPC/RMI model
    • HomeKit (Apple)
      • Walled garden - requires MFi chip
      • Not widely adopted
      • No killer app from Apple
    • Google Weave
      • Heavily coupled with Google cloud
      • REST model
      • WiFi and BLE
      • Different from Nest Weave… aren’t they the same company?
MQTT
  • Lightweight pub/sub protocol (store and forward) with reliable bidirectional message delivery
  • Runs on top of TCP/IP (some variant can run on non-TCP/IP network)
  • QoS includes:
    • 0 - at most once
    • 1 - at least once
    • 2 - exactly once
  • Used by FB Messenger
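
The store-and-forward idea in the bullets above can be sketched with a toy in-memory broker in Python. This is only an illustration of the pattern, not a real protocol implementation: there is no network, no QoS handshake and no persistence, and the class and method names (ToyBroker, connect, publish, ...) are invented for this example.

```python
from collections import defaultdict, deque


class ToyBroker:
    """In-memory pub/sub broker that stores messages for offline subscribers."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # topic -> set of client ids
        self.online = {}                     # client id -> delivery callback
        self.pending = defaultdict(deque)    # client id -> queued (topic, payload)

    def subscribe(self, client_id, topic):
        self.subscribers[topic].add(client_id)

    def connect(self, client_id, callback):
        """Mark a client online and forward anything stored while it was away."""
        self.online[client_id] = callback
        while self.pending[client_id]:
            callback(*self.pending[client_id].popleft())

    def disconnect(self, client_id):
        self.online.pop(client_id, None)

    def publish(self, topic, payload):
        for client_id in self.subscribers[topic]:
            callback = self.online.get(client_id)
            if callback:
                callback(topic, payload)  # subscriber is online: forward now
            else:
                self.pending[client_id].append((topic, payload))  # store for later
```

A subscriber that is offline when a message is published receives it as soon as it connects, in the order the messages were published.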
IoT programming model
  • Local/proximal control/orchestration
    • Lower latency
    • Data stays local (privacy + security)
    • Fog model (where data collection and preprocessing is done by border gateway devices)
  • IoT frameworks (from lower level to higher)
    • Zephyr (RTOS for IoT) - for devices too small to run Linux (e.g. no MMU)
    • Ostro - IoT Linux (built using Yocto)
    • Soletta - yet another IoT API framework
    • MRAA library
    • Intel IoT services orchestration layer

Keep Calm and Break Rules

Inefficient and incompatible rules
It is a truth universally acknowledged that some of the Host Managed SMR rules are not exactly conducive to efficient I/O. Even worse, they are incompatible with current host side implementations. For example, a fresh drive with no written data will FAIL ALL read commands sent by the host, as ZAC/ZBC does not allow the host to read unwritten LBAs. This behavior impedes the BIOS/OS attempt to read the partition table/disk signature during system initialization, resulting in either boot failure or a long boot time (waiting for retries and timeouts).

Feedback and assumptions
These situations occur when we attempt to define protocols and heuristics without adequate and timely host side validation. By the time host side feedback is considered, firmware and hardware implementations have already ossified. Furthermore, storage standard authors sometimes incorrectly assume that as long as a functionality is defined (in a standard), operating systems will have already implemented support for it (e.g. ATA sense data reporting, SCT WRITE SAME). When reality hits assumption, we are left with inefficient and incompatible implementations.

Rules are meant to be broken
Based on observations working with Host Managed SMR devices and conversations with fellow developers, here are some rules that we should consider breaking.

1. Allow read beyond zone Write Pointer, returning zeros or host specified data pattern for unwritten LBAs (similar to reading unmapped/trimmed sectors).

This will allow current system initialization procedures to function without error and simplify host side implementation.

2. Allow read/write operations to span zones.

This will eliminate the need to split I/Os along zone boundaries, thus increasing I/O efficiency and simplifying host side implementation (especially when there are multiple zone sizes).
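
To make the cost concrete, here is a rough Python sketch of the splitting a host has to do today. It assumes a single uniform zone size, which is the easy case; the function name split_by_zone and the (lba, count) tuple format are made up for this example.

```python
def split_by_zone(start_lba, num_blocks, zone_size):
    """Split [start_lba, start_lba + num_blocks) into per-zone (lba, count) pieces."""
    pieces = []
    lba, remaining = start_lba, num_blocks
    while remaining > 0:
        # First LBA of the next zone; the current piece must stop before it.
        zone_end = (lba // zone_size + 1) * zone_size
        count = min(remaining, zone_end - lba)
        pieces.append((lba, count))
        lba += count
        remaining -= count
    return pieces
```

With 256-block zones, a 300-block I/O starting at LBA 100 becomes two commands, (100, 156) and (256, 144). With multiple zone sizes the host would also need a zone lookup table instead of the simple division used here.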

3. Allow write commands addressed to zone starting LBA to implicitly reset zone Write Pointer.

State of zone as write command is issued

State of zone after write command is processed

This is a potentially dangerous proposition - as a stray write could accidentally reset a zone and delete all its contents. However, this proposal will eliminate the need to send and wait for the completion of an extra reset Write Pointer command in the I/O path. Moreover, RESET WRITE POINTER EXT as defined currently in ZAC is a non-queued command, which cannot be mixed with NCQ commands (e.g. common read/write commands) without performance penalties.

4. Allow write commands to start beyond zone Write Pointer, filling gap (unwritten LBAs) with zeros or host defined data pattern.

State of zone as write command is issued

State of zone after write command is processed

This will eliminate the need for host to send unnecessary write commands just to advance the Write Pointer (so it could write to a specific LBA), leading to better performance and simpler implementation.
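
Proposals 1, 3 and 4 can be modelled together in a few lines of Python. This is a toy model of a single zone under the relaxed rules, not device firmware or any real API: the class name RelaxedZone, the per-LBA dictionary, and the FILL value standing in for the "host specified data pattern" are all assumptions made for the example.

```python
class RelaxedZone:
    """Toy model of one zone with proposals 1, 3 and 4 applied."""

    FILL = 0  # stand-in for zeros or a host defined data pattern

    def __init__(self, start_lba, size):
        self.start = start_lba
        self.end = start_lba + size  # first LBA past the zone
        self.wp = start_lba          # write pointer
        self.data = {}               # lba -> value, written LBAs only

    def read(self, lba, count):
        # Proposal 1: unwritten LBAs read back as the fill pattern.
        return [self.data.get(x, self.FILL) for x in range(lba, lba + count)]

    def write(self, lba, values):
        if lba < self.start or lba + len(values) > self.end:
            raise ValueError("write must stay inside this zone")
        if lba == self.start:
            # Proposal 3: a write to the starting LBA implicitly resets the WP.
            self.data.clear()
            self.wp = self.start
        if lba < self.wp:
            raise ValueError("write cannot start before the write pointer")
        # Proposal 4: fill the gap between the WP and the write's first LBA.
        for x in range(self.wp, lba):
            self.data[x] = self.FILL
        for offset, value in enumerate(values):
            self.data[lba + offset] = value
        self.wp = lba + len(values)
```

Note how the implicit reset in write() throws away the whole zone on a single write to its first LBA, which is exactly the stray-write risk mentioned under proposal 3.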

As mentioned previously - the benefits of having SMR must greatly outweigh the cost of its adoption. It is essential that storage vendors provide an easy and efficient transition path to SMR. Sometimes that means breaking some rules along the way.

Keep Calm and Follow the Rules

Accessing Host Managed SMR devices
Host Managed SMR devices must be accessed using either the Zoned Access (ATA) or Zoned Block (SCSI) command sets (ZAC and ZBC), which restrict the I/O operations that can be sent from the host. This results in simplified device implementation and behavior, as the burden of shingled writing is shifted to host software. It is expected that the host will have access to more compute resources (e.g. memory) and semantic/system-level information than a low-level storage device.

Below is a list of major I/O restrictions that ZAC and ZBC enforce on the host.
Nonconforming I/O operations will be failed by the device.

  • Sequential write - writes have to start at zone Write Pointer (WP)
  • Zone WPs could be reset (to the zone’s starting LBA) by issuing a RESET WRITE POINTER command
  • Writes have to be 4K aligned in SMR zones
  • Reads cannot start or extend beyond the zone WP
  • Read/Write commands cannot span zones (some exceptions may apply)

Zone WPs are kept and maintained by the device. They keep host writes sequential so we don’t accidentally wipe out already written data. After each successful write, the associated zone WP is advanced to the next “unwritten” location within the zone (i.e. largest LBA written + 1).

    Here are some concrete examples of the above I/O rules.

    1. Not allowed - write commands cannot start before zone WP.
    2. Allowed - write commands must start at zone WP.
    3. Allowed - write commands must start and end in the same zone.
    4. Not allowed - write commands cannot span multiple zones.
    5. Not allowed - write commands cannot start after zone WP.
    6. Allowed - read commands must start and end before zone WP.
    7. Allowed - read commands can span up to LBA (zone WP - 1).
    8. Not allowed - read commands cannot span multiple zones.
    9. Not allowed - read commands cannot start on zone WP.
    10. Not allowed - read commands cannot start after zone WP.
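
The rules and examples above can be captured in a small Python model of a single zone. This is a sketch for reasoning about the rules, not an implementation of ZAC/ZBC: the Zone and ZoneError names are made up, the 4K alignment rule is left out, and count is assumed to be at least 1.

```python
class ZoneError(Exception):
    """Raised when an I/O violates the zone rules."""


class Zone:
    def __init__(self, start_lba, size):
        self.start = start_lba
        self.end = start_lba + size  # first LBA past the zone
        self.wp = start_lba          # write pointer: next LBA to be written

    def write(self, lba, count):
        if lba != self.wp:
            raise ZoneError("write must start at the write pointer")
        if lba + count > self.end:
            raise ZoneError("write cannot span zones")
        self.wp += count             # largest LBA written + 1

    def read(self, lba, count):
        if lba < self.start or lba + count > self.end:
            raise ZoneError("read cannot span zones")
        if lba + count > self.wp:
            raise ZoneError("read cannot start at or extend beyond the write pointer")

    def reset_write_pointer(self):
        self.wp = self.start         # models RESET WRITE POINTER
```

On a freshly created Zone the write pointer sits at the starting LBA, so every read raises ZoneError until something has been written.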

    In the next post I’ll talk about why we should break some of these rules for efficiency.