Image and Video Processing Techniques with Deep Learning - Xiaoyong Shen (深度学习下的图像视频处理技术-沈小勇)
Motivation. Old and fundamental: several decades ago [Huang et al, 1984] → recent years. Many applications: HD video generation from low-res sources, video enhancement with details, text/object recognition in surveillance videos, etc. ... CNN-based: SRCNN [Dong et al, 2014], VDSR [Kim et al, 2016], FSRCNN [Dong et al, 2016], etc. Video SR, traditional: 3DSKR [Takeda et al, 2009], BayesSR [Liu et al, 2011], MFSR [Ma et al, 2015], etc.
0 credits | 121 pages | 37.75 MB | 1 year ago
Fully Connected Neural Networks in Practice, PyTorch Edition (全连接神经网络实战. pytorch 版)
... transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, or by any information storage or retrieval system, without the prior written permission of the ...
0 credits | 29 pages | 1.40 MB | 1 year ago
Keras: The Python Deep Learning Library (Keras: 基于 Python 的深度学习库)
... TimeDistributed
video_input = Input(shape=(100, 224, 224, 3))
# This is the video encoding built from the previously defined vision model (weights are reused)
encoded_frame_sequence = TimeDistributed(vision_model)(video_input)
# The output is a sequence of vectors
encoded_video = LSTM(256)(en ...
... Let's use it to encode the question:
video_question_input = Input(shape=(100,), dtype='int32')
encoded_video_question = question_encoder(video_question_input)
# And this is our video question answering model:
merged = keras.layers.concatenate([encoded_video, encoded_video_question])
output = Dense(1000, activation='softmax')(merged)
video_qa_model = Model(inputs=[video_input, video_question_input], outputs=output)
... 3.3 Keras FAQ: Frequently Asked Questions 3.3.1 Keras ...
0 credits | 257 pages | 1.19 MB | 1 year ago
Machine Learning Pytorch Tutorial
Loss Function, Optimization Algorithm. More info about the training process in last year's lecture video. Training & Testing Neural Networks: Validation, Testing, Training. Guide for training/validation/testing ... torch.nn – Neural Network Layers ● Linear Layer (Fully-connected Layer), ref: last year's lecture video ... algorithms that adjust network parameters to reduce error. (See Adaptive Learning Rate lecture video) ● E.g. Stochastic Gradient Descent (SGD): torch.optim.SGD(model.parameters(), lr, momentum = 0)
0 credits | 48 pages | 584.86 KB | 1 year ago
Lecture 1: Overview
Many words in a document; many, many documents available on the web. Image/Video Understanding: given an image or video, determine what objects it contains. Determine what semantics it contains. Determine market conditions and other possible side information. Predict the age of a viewer watching a given video on YouTube. Predict the location in 3D space of a robot arm end effector, given control signals (torques).
0 credits | 57 pages | 2.41 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures
a) To compress the information content of high-dimensional concepts such as text, image, audio, video, etc. to a low-dimensional representation such as a fixed length vector of floating point numbers ... of training the model, it is agnostic to what the embedding is for (a piece of text, audio, image, video, or some abstract concept). Here is a quick recipe to train embedding-based models: 1. Embedding ... directly use in your model. There are a large number of popular models across image, text, audio, and video domains that are ready-to-deploy. For instance, you should not spend resources and time training your ...
0 credits | 53 pages | 3.92 MB | 1 year ago
Machine Learning Course - Wenzhou University - 14 Deep Learning - Vision Transformer (ViT) (机器学习课程-温州大学-14深度学习-Vision Transformer (ViT))
... com/video/BV18Q4y1o7NY 3. Dosovitskiy. An image is worth 16×16 words: transformers for image recognition at scale. In ICLR. 4. Tang Yudi (唐宇迪), https://www.bilibili.com/ 5. https://www.bilibili.com/video/BV1Uu411o7oY
0 credits | 34 pages | 2.78 MB | 1 year ago
Deep Learning Modeling Practice on Alibaba Cloud - Cheng Mengli (阿里云上深度学习建模实践-程孟力)
SmoothL1, DiceLoss, Contrastive, RCNNHead, MaskHead, SeqHead, Vit, Swin, Retrieval, Image Generation, Video Caption. EasyVision: image and video algorithm library. Bert, TextInput, Optimizer ... High performance: distributed storage, distributed queries. Full-featured: GSL/negative sampling ... swin-transformer based moco. Image features, recommendation model features, image search. Solution: multimodal pretraining, Swin transformer based (Violet), VIT, Video Frames, Bert, Title, OCR, Cls Token, Title feature, OCR feature, Image feature, MHSA, Fusion
0 credits | 40 pages | 8.51 MB | 1 year ago
keras tutorial
... becoming more popular in data science fields like robotics, artificial intelligence (AI), audio & video recognition and image recognition. Artificial neural network is the core of deep learning methodologies ... Convolutional neural network is one of the most popular ANN. It is widely used in the fields of image and video recognition. It is based on the concept of convolution, a mathematical concept. It is almost similar ...
0 credits | 98 pages | 1.57 MB | 1 year ago
[PyTorch Deep Learning - Teacher Longlong] - Beta Version 202112 (【PyTorch深度学习-龙龙老师】-测试版202112)
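... FCN, U-net, PSPNet, the DeepLab series, etc. Preview edition 202112. 1.4 Deep Learning Applications. Figure 1.15: Object detection results. Figure 1.16: Semantic segmentation results. Video Understanding: as deep learning has achieved good results on 2D image tasks, 3D video understanding tasks, which carry temporal information, are receiving more and more attention. Common video understanding tasks include video classification, action detection, and video subject extraction. Commonly used models include ... Chapter 14 Reinforcement Learning ...
loss = -log_prob * R
with tape.stop_recording():
    # Optimize the policy network
    grads = tape.gradient(loss, self.trainable_variables)
0 credits | 439 pages | 29.91 MB | 1 year ago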