usb:udc:samsung: Zero copy approach for data passed to Samsung's UDC driver

The Samsung UDC driver no longer copies data from USB requests into
aligned internal buffers. It now works directly on the buffers allocated
in the upper layers, such as UMS, DFU and THOR.

This change is possible because those gadgets are now required to allocate
buffers aligned to the cache line size (CONFIG_SYS_CACHELINE_SIZE).

This can be achieved with the DEFINE_CACHE_ALIGN_BUFFER() or
ALLOC_CACHE_ALIGN_BUFFER() macros. They allocate a buffer whose start
address and size are both aligned to the cache line.
Sometimes it is enough to use memalign() with a size that is a
multiple of the cache line size.
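
As a rough illustration of the alignment rule above, here is a minimal
host-side sketch. It is not taken from this patch: it uses C11
aligned_alloc() as a stand-in for memalign(), CACHELINE_SIZE is an
assumed value of 64 standing in for CONFIG_SYS_CACHELINE_SIZE, and
cacheline_round_up() and alloc_cache_aligned() are hypothetical helper
names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed value; in U-Boot this would be CONFIG_SYS_CACHELINE_SIZE. */
#define CACHELINE_SIZE 64

/* The mask arithmetic below only works for power-of-two line sizes. */
static_assert((CACHELINE_SIZE & (CACHELINE_SIZE - 1)) == 0,
	      "cache line size must be a power of two");

/* Hypothetical helper: round len up to a multiple of the cache line size. */
static size_t cacheline_round_up(size_t len)
{
	return (len + CACHELINE_SIZE - 1) & ~((size_t)CACHELINE_SIZE - 1);
}

/*
 * Hypothetical helper: allocate a buffer that starts on a cache line
 * boundary and spans a whole number of cache lines.
 * Returns NULL on failure; release the buffer with free().
 */
static void *alloc_cache_aligned(size_t len)
{
	return aligned_alloc(CACHELINE_SIZE, cacheline_round_up(len));
}
```

With these assumptions, a request for 100 bytes would be backed by a
128-byte allocation starting on a 64-byte boundary, so both ends of the
buffer fall on cache line boundaries.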

Test conditions:
- test HW + measurement: Trats - Exynos4210 rev.1
- test HW: Trats2 - Exynos4412 rev.1
- 400 MiB compressed rootfs image download with `thor 0 mmc 0`

Measurement:
Transmission speed: 27.04 MiB/s

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Cc: Marek Vasut <marex@denx.de>
diff --git a/include/usb/s3c_udc.h b/include/usb/s3c_udc.h
index 734c6cd..6dead2f 100644
--- a/include/usb/s3c_udc.h
+++ b/include/usb/s3c_udc.h
@@ -19,7 +19,7 @@
 
 /*-------------------------------------------------------------------------*/
 /* DMA bounce buffer size, 16K is enough even for mass storage */
-#define DMA_BUFFER_SIZE	(4096*4)
+#define DMA_BUFFER_SIZE	(16*SZ_1K)
 
 #define EP0_FIFO_SIZE		64
 #define EP_FIFO_SIZE		512
@@ -81,9 +81,6 @@
 
 	struct s3c_plat_otg_data *pdata;
 
-	void *dma_buf[S3C_MAX_ENDPOINTS+1];
-	dma_addr_t dma_addr[S3C_MAX_ENDPOINTS+1];
-
 	int ep0state;
 	struct s3c_ep ep[S3C_MAX_ENDPOINTS];