mmc: sdhci: Wait for SDHCI_INT_DATA_END when transferring.

sdhci_transfer_data() function transfers the blocks passed up to the
number of blocks defined in mmc_data, but returns immediately once all
the blocks are transferred, even if the loop exit condition is not met
(bit SDHCI_INT_DATA_END set in the STATUS word).

When doing multiple writes to mmc, returning right after the last block
is transferred can cause the write to fail when sending the
MMC_CMD_STOP_TRANSMISSION command right after the
MMC_CMD_WRITE_MULTIPLE_BLOCK command, leaving the mmc driver in an
unconsistent state until reboot. This error was observed in the rpi3
board.

This patch waits for the SDHCI_INT_DATA_END bit to be set even after
sending all the blocks.

Test: Reliably wrote 2GiB of data to mmc in a rpi3.

Signed-off-by: Alex Deymo <deymo@google.com>
Reviewed-by: Simon Glass <sjg@chromium.org>
diff --git a/drivers/mmc/sdhci.c b/drivers/mmc/sdhci.c
index c94d58d..b745977 100644
--- a/drivers/mmc/sdhci.c
+++ b/drivers/mmc/sdhci.c
@@ -72,6 +72,7 @@
 				unsigned int start_addr)
 {
 	unsigned int stat, rdy, mask, timeout, block = 0;
+	bool transfer_done = false;
 #ifdef CONFIG_MMC_SDHCI_SDMA
 	unsigned char ctrl;
 	ctrl = sdhci_readb(host, SDHCI_HOST_CONTROL);
@@ -89,17 +90,23 @@
 			       __func__, stat);
 			return -EIO;
 		}
-		if (stat & rdy) {
+		if (!transfer_done && (stat & rdy)) {
 			if (!(sdhci_readl(host, SDHCI_PRESENT_STATE) & mask))
 				continue;
 			sdhci_writel(host, rdy, SDHCI_INT_STATUS);
 			sdhci_transfer_pio(host, data);
 			data->dest += data->blocksize;
-			if (++block >= data->blocks)
-				break;
+			if (++block >= data->blocks) {
+				/* Keep looping until the SDHCI_INT_DATA_END is
+				 * cleared, even if we finished sending all the
+				 * blocks.
+				 */
+				transfer_done = true;
+				continue;
+			}
 		}
 #ifdef CONFIG_MMC_SDHCI_SDMA
-		if (stat & SDHCI_INT_DMA_END) {
+		if (!transfer_done && (stat & SDHCI_INT_DMA_END)) {
 			sdhci_writel(host, SDHCI_INT_DMA_END, SDHCI_INT_STATUS);
 			start_addr &= ~(SDHCI_DEFAULT_BOUNDARY_SIZE - 1);
 			start_addr += SDHCI_DEFAULT_BOUNDARY_SIZE;