Discovering the Truth Behind an invalid_file_path Error

Published in Coder stories

March 28, 2019

Author
Stéphane Robino

Full stack developer @ WTTJ

In the Elixir application that powers our platform, we use the arc library to manage file uploads (it is similar to Carrierwave in the Rails community). An interesting feature of this library is its ability to upload a file from a remote URL. A basic use case is a user providing a URL in a form rather than a file to upload. Obviously, the URL must be public and reachable.

Recently, we noticed many invalid changeset errors with the message invalid_file_path during this operation (changesets let you filter, cast, validate, and define constraints when manipulating structs). The weird thing was that the requested files came from our own content delivery network (CDN) and could be fetched directly from a browser.

Understanding the issue

First, we reproduced the error locally using our own code. We got exactly the same error message, which didn't help us understand what was happening at all.

To get a more detailed error description, our second idea was to run a code snippet taken directly from the arc library. The part of the code required is the bit that deals with downloading the file from our remote server before storing it locally or remotely (on Amazon S3, for example).

Here is a simplified version of the arc library code:

# https://github.com/stavro/arc/blob/v0.11.0/lib/arc/file.ex
url = "https://cdn.example.com/images/avatar.jpg"
options = [
  follow_redirect: true,
  recv_timeout: Application.get_env(:arc, :recv_timeout, 5_000),
  connect_timeout: Application.get_env(:arc, :connect_timeout, 10_000),
  timeout: Application.get_env(:arc, :timeout, 10_000),
  max_retries: Application.get_env(:arc, :max_retries, 3),
  backoff_factor: Application.get_env(:arc, :backoff_factor, 1000),
  backoff_max: Application.get_env(:arc, :backoff_max, 30_000)
]
# GET request with no extra headers and an empty body
:hackney.get(url, [], "", options)

After running this snippet we got the following error:

[info] ['TLS', 32, 'client', 58, 32, 73, 110, 32, 115, 116, 97, 116, 101, 32, 'certify', 32, 'at ssl_handshake.erl:1335 generated CLIENT ALERT: Fatal - Handshake Failure - {bad_cert,invalid_key_usage}', 10]
{:error, {:tls_alert, 'handshake failure'}}

This message tells us that we had an SSL problem: the handshake fails. Because the library can't fetch the file to store it, it returns the invalid_file_path error message.
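The [info] line is hard to read because Erlang logged it as an iolist, a list that mixes charlists with raw character codes. As a small aside, here is a sketch of how it can be flattened into a readable string with IO.iodata_to_binary/1. The variable names are ours; the data is copied from the log.

```elixir
# The iolist as it appears in the log.
# Integers are character codes (32 is a space, 10 a newline).
tls_log = [
  ~c"TLS", 32, ~c"client", 58, 32, 73, 110, 32,
  115, 116, 97, 116, 101, 32, ~c"certify", 32,
  ~c"at ssl_handshake.erl:1335 generated CLIENT ALERT: Fatal - Handshake Failure - {bad_cert,invalid_key_usage}",
  10
]

# Flatten the nested list into one binary and drop the trailing newline.
message = tls_log |> IO.iodata_to_binary() |> String.trim()
IO.puts(message)
```

The decoded text starts with "TLS client: In state certify", followed by the alert exactly as it appears in the log.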

Quick fixes

After multiple searches on the Internet, we surmised the issue was due to a problem with either the protocol version (TLS/SSL) or the server name indication (SNI), an extension of the TLS protocol by which the client indicates the hostname it is attempting to connect to at the start of the handshake. Many posts suggest fixing the issue by providing ssl_options for the hackney requests (or an SSL option for HTTPoison).

We tried two fixes by adding options to the initial snippet: first forcing the protocol version to TLS 1.2, then providing the server_name_indication, as seen below:

url = "https://cdn.example.com/images/avatar.jpg"
options = [
  # ...
  ssl_options: [versions: [:"tlsv1.2"]]
  # OR ssl_options: [server_name_indication: 'cdn.example.com']
]
:hackney.get(url, [], "", options)

Both solutions gave us the following successful response:

{:ok, 200, [...], #Reference<0.3402715975.1686634497.65800>}

Before we submitted any fix to either our app or the third-party library, we wanted to discover why an up-to-date library like hackney would need basic SSL options to complete the handshake.

Digging deeper into hackney

Hackney is an HTTP client written in Erlang and used by many HTTP wrapper libraries, such as HTTPoison or Tesla in the Elixir world.

The developers of arc chose to use hackney directly, as seen in the snippet above.

Check the default SSL connect options on hackney

First of all, we wanted to look at the default SSL options used in hackney to perform the HTTPS connect.

%% https://github.com/benoitc/hackney/blob/1.15.0/src/hackney_connect.erl#L314
ssl_opts(Host, Options) ->
  case proplists:get_value(ssl_options, Options) of
    undefined ->
      ssl_opts_1(Host, Options);
    [] ->
      ssl_opts_1(Host, Options);
    SSLOpts ->
      SSLOpts
  end.

The first thing we noted, and it is crucial, is that if we provide any SSL options to a hackney call, the library's default options are replaced rather than merged.
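To make that behaviour concrete, here is a minimal Elixir sketch that mirrors the branching of ssl_opts/2. The module name and the shortened default list are ours, not hackney's; the real defaults also carry the CA bundle and the hostname check.

```elixir
defmodule SslOptsSketch do
  # Stand-in for hackney's default list (hypothetical, shortened).
  def defaults(_host), do: [verify: :verify_peer, depth: 99]

  # Same branching as ssl_opts/2: the defaults are used only when the
  # caller passes no ssl_options, or an empty list.
  def ssl_opts(host, options) do
    case Keyword.get(options, :ssl_options) do
      nil -> defaults(host)
      [] -> defaults(host)
      custom -> custom
    end
  end
end

host = "cdn.example.com"
without_custom = SslOptsSketch.ssl_opts(host, follow_redirect: true)
with_custom = SslOptsSketch.ssl_opts(host, ssl_options: [versions: [:"tlsv1.2"]])

IO.inspect(Keyword.get(without_custom, :verify)) # :verify_peer
IO.inspect(Keyword.get(with_custom, :verify))    # nil
```

As soon as a single custom option is passed, the verify entry is gone from the resulting list.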

The default options are outlined below:

%% https://github.com/benoitc/hackney/blob/1.15.0/src/hackney_connect.erl#L324
ssl_opts_1(Host, Options) ->
  Insecure =  proplists:get_value(insecure, Options, false),
  CACerts = certifi:cacerts(),
  case Insecure of
    true ->
      [{verify, verify_none}];
    false ->
      VerifyFun = {
        fun ssl_verify_hostname:verify_fun/3,
        [{check_hostname, Host}]
      },
      [{verify, verify_peer},
       {depth, 99},
       {cacerts, CACerts},
       {partial_chain, fun partial_chain/1},
       {verify_fun, VerifyFun}]
  end.

By default, hackney performs certificate verification when connecting over HTTPS, against the Mozilla certificate authority (CA) bundle provided by erlang-certifi.

We quickly understood that when ssl_options are provided, this verification is no longer performed, so passing custom ssl_options, or at least partial ones, is a bad idea.
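If a custom option really is needed, one approach is to merge it into a full option list instead of passing it alone. The sketch below only illustrates the merge itself: the module and its placeholder defaults are hypothetical, and in a real call the caller would have to rebuild the complete list, CA bundle and hostname check included.

```elixir
defmodule SslOptsMerge do
  # Placeholder defaults (hypothetical, shortened for the example).
  def defaults, do: [verify: :verify_peer, depth: 99]

  # Caller-supplied keys win; every other default is kept.
  def with_overrides(overrides), do: Keyword.merge(defaults(), overrides)
end

merged = SslOptsMerge.with_overrides(versions: [:"tlsv1.2"])
IO.inspect(Keyword.get(merged, :verify))   # :verify_peer
IO.inspect(Keyword.get(merged, :versions)) # [:"tlsv1.2"]
```

The point of the design is that verification stays on unless the caller explicitly overrides the verify key.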

Check the default SSL options on hackney

%% https://github.com/benoitc/hackney/blob/1.15.0/src/hackney_ssl.erl#L62
connect(Host, Port, Opts, Timeout) when is_list(Host), is_integer(Port),
  (Timeout =:= infinity orelse is_integer(Timeout)) ->
  BaseOpts = [binary, {active, false}, {packet, raw},
    {secure_renegotiate, true},
    {reuse_sessions, true},
    {honor_cipher_order, true},
    {versions,['tlsv1.2', 'tlsv1.1', tlsv1, sslv3]},
    {ciphers, ciphers()}],
  Opts1 = hackney_util:merge_opts(BaseOpts, Opts),
  Host1 = parse_address(Host),
  %% connect
  ssl:connect(Host1, Port, Opts1, Timeout).

By default, hackney sets the protocol versions to ['tlsv1.2', 'tlsv1.1', tlsv1, sslv3], so Erlang's ssl module tries tlsv1.2 first and falls back to tlsv1.1 if it can't connect. If our server accepts tlsv1.2, the first quick fix above changes nothing about the protocol, because tlsv1.2 is already hackney's default. It only appeared to work because passing ssl_options disabled the certificate verification.
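A deliberately simplified model of version selection shows why forcing tlsv1.2 makes no difference here. This is not how the ssl module is implemented; the module name and the server's version list are assumptions made for the example.

```elixir
defmodule VersionSketch do
  # hackney 1.15.0 client list, most preferred first.
  def client_versions, do: [:"tlsv1.2", :"tlsv1.1", :tlsv1, :sslv3]

  # Simplified model: pick the first client version the server accepts.
  def negotiate(client_versions, server_versions) do
    Enum.find(client_versions, fn version -> version in server_versions end)
  end
end

# Assumed server configuration, for illustration only.
server = [:"tlsv1.2", :"tlsv1.1"]

default_pick = VersionSketch.negotiate(VersionSketch.client_versions(), server)
forced_pick = VersionSketch.negotiate([:"tlsv1.2"], server)

IO.inspect({default_pick, forced_pick}) # {:"tlsv1.2", :"tlsv1.2"}
```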

Our problem therefore looked to be with the certificate verification. By carrying out tests using curl (and other systems), we were able to confirm that these systems are able to verify the certificate without any issue.

$ curl https://cdn.example.com --verbose
* Rebuilt URL to: https://cdn.example.com/
*   Trying XX.XXX.XX.XXX...
* TCP_NODELAY set
* Connected to cdn.example.com (XX.XXX.XX.XXX) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: OU=Domain Control Validated; CN=*.example.com
*  start date: Jul 20 15:56:38 2016 GMT
*  expire date: Jul 20 15:56:38 2019 GMT
*  subjectAltName: host "cdn.example.com" matched certs "*.example.com"
*  issuer: C=US; ST=Arizona; L=Scottsdale; O=Starfield Technologies, Inc.; OU=http://certs.starfieldtech.com/repository/; CN=Starfield Secure Certificate Authority - G2
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7f8230802a00)
> GET / HTTP/2
> Host: cdn.example.com
> User-Agent: curl/7.54.0
> Accept: */*

The log above shows us that curl is able to connect using TLS 1.2 and verify the certificate chain using CAfile: /etc/ssl/cert.pem.

So, what’s the problem?

Certificate chain verification

We can inspect the certificate chain with the openssl command, as below:

$ openssl s_client -connect cdn.example.com:443 -servername cdn.example.com </dev/null
...
Certificate chain
 0 s:/OU=Domain Control Validated/CN=*.example.com
   i:/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./OU=http://certs.starfieldtech.com/repository/CN=Starfield Secure Certificate Authority - G2
 1 s:/OU=Domain Control Validated/CN=*.example.com
   i:/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./OU=http://certs.starfieldtech.com/repository/CN=Starfield Secure Certificate Authority - G2
 2 s:/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./OU=http://certs.starfieldtech.com/repository/CN=Starfield Secure Certificate Authority - G2
   i:/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./CN=Starfield Root Certificate Authority - G2
 3 s:/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./CN=Starfield Root Certificate Authority - G2
   i:/C=US/O=Starfield Technologies, Inc./OU=Starfield Class 2 Certification Authority
 4 s:/C=US/O=Starfield Technologies, Inc./OU=Starfield Class 2 Certification Authority
   i:/C=US/O=Starfield Technologies, Inc./OU=Starfield Class 2 Certification Authority
...

The openssl command log shows us that the certificate chain is not perfect: The first link and the second link are the same. The hackney implementation of CA verification is stricter than others and the chain must be perfect to be traversed correctly.
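The broken link can be spotted mechanically. The sketch below compares names only: each entry's issuer should be the subject of the next entry. Real verification also checks signatures, validity dates and key usage, so this is an illustration rather than what hackney does. The module name is ours and the names are shortened from the openssl output.

```elixir
defmodule ChainSketch do
  # Returns the positions whose issuer is not the subject of the next entry.
  def broken_links(chain) do
    chain
    |> Enum.chunk_every(2, 1, :discard)
    |> Enum.with_index()
    |> Enum.filter(fn {[{_subject, issuer}, {next_subject, _next_issuer}], _index} ->
      issuer != next_subject
    end)
    |> Enum.map(fn {_pair, index} -> index end)
  end
end

# {subject, issuer} pairs, shortened from the openssl output.
chain = [
  {"*.example.com", "Starfield Secure CA - G2"},
  {"*.example.com", "Starfield Secure CA - G2"},
  {"Starfield Secure CA - G2", "Starfield Root CA - G2"},
  {"Starfield Root CA - G2", "Starfield Class 2 CA"},
  {"Starfield Class 2 CA", "Starfield Class 2 CA"}
]

IO.inspect(ChainSketch.broken_links(chain))     # [0]
IO.inspect(ChainSketch.broken_links(tl(chain))) # []
```

Dropping the duplicated first entry leaves a chain in which every issuer matches the next subject.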

This is where the invalid_key_usage error comes from: we had a problem with the way we were serving our certificate. Our CDN is served by Amazon CloudFront and its SSL configuration is handled by AWS Certificate Manager (ACM). To import a certificate, ACM requires three separate items: the certificate body, the certificate private key, and the certificate chain. Nginx users, by contrast, concatenate the certificate body and the certificate chain into a single file, so it is easy to import the body a second time as part of the chain.
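The article does not say how the files were prepared for the reimport. As one possible approach, assuming the starting point is an Nginx-style concatenated PEM file, the body and the chain can be separated like this. The module name and the fake certificate contents are placeholders.

```elixir
defmodule PemSketch do
  # Splits a concatenated PEM bundle into {body, chain}. The first block is
  # taken as the certificate body; later copies of it are left out of the chain.
  # Raises a MatchError if the input contains no certificate block.
  def split_bundle(pem) do
    pattern = ~r/-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----/s
    [body | rest] = pattern |> Regex.scan(pem) |> List.flatten()
    {body, Enum.reject(rest, fn cert -> cert == body end)}
  end
end

# Fake certificates, for illustration only.
leaf = "-----BEGIN CERTIFICATE-----\nLEAF\n-----END CERTIFICATE-----"
intermediate = "-----BEGIN CERTIFICATE-----\nINTERMEDIATE\n-----END CERTIFICATE-----"

# A bundle in which the leaf certificate was included twice by mistake.
bundle = Enum.join([leaf, leaf, intermediate], "\n")

{body, chain} = PemSketch.split_bundle(bundle)
IO.inspect(length(chain)) # 1
```

The body then goes into ACM's certificate body field and the remaining blocks into the certificate chain field.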

By reimporting our certificate correctly, the error disappeared, without the need for code fixes.

Conclusion

Sometimes a basic error can signal more important problems. The Internet contains a lot of resources to fix most issues but it’s really important to understand why the fixes work. Applying a quick fix without knowing what it does can easily lead to a bigger issue.

This article is part of Behind the Code, the media for developers, by developers. Discover more articles and videos by visiting Behind the Code!

Illustration by Blok
